---
library_name: peft
base_model: codellama/CodeLlama-7b-hf
license: llama2
dataset:
  type: codeparrot/xlcost-text-to-code
  name: xlcost
tags:
- code
---

# Model Card for Model ID

## Model Details

### Model Description

This model is the base CodeLlama model (`codellama/CodeLlama-7b-hf`) fine-tuned on C++ code from the `codeparrot/xlcost-text-to-code` dataset. It generates C++ code from a given task description.

If you get the error `ValueError: Tokenizer class CodeLlamaTokenizer does not exist or is not currently imported.`, make sure you have `transformers` 4.33.0 and `accelerate>=0.20.3` installed.

- **Developed by:** Rudan XIAO
- **Model type:** code generation
- **License:** llama2
- **Finetuned from model [optional]:** codellama/CodeLlama-7b-hf

### Model Sources [optional]

- **Repository:** https://github.com/medxiaorudan/CodeGeneration
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

### Direct Use

[More Information Needed]

### Downstream Use [optional]

[More Information Needed]

### Out-of-Scope Use

[More Information Needed]

## Bias, Risks, and Limitations

[More Information Needed]

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information is needed for further recommendations.

## How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]

## Training Details

### Training Data

https://huggingface.co/datasets/codeparrot/xlcost-text-to-code

[More Information Needed]

### Training Procedure

The detailed training report is available on [Weights & Biases](https://wandb.ai/medxiaorudan/CodeLlama_finetune_CPP?workspace=user-medxiaorudan).
#### Preprocessing [optional]

[More Information Needed]

#### Training Hyperparameters

- **Training regime:** bf16

#### Speeds, Sizes, Times [optional]

[More Information Needed]

## Evaluation

I used the Catch2 unit test framework to verify the correctness of the generated C++ code snippets.

Todo: Use the pass@k metric with the HumanEval-X dataset to evaluate the model's performance.

### Testing Data, Factors & Metrics

#### Testing Data

https://huggingface.co/datasets/THUDM/humaneval-x

[More Information Needed]

#### Factors

[More Information Needed]

#### Metrics

[More Information Needed]

### Results

[More Information Needed]

#### Summary

## Model Examination [optional]

[More Information Needed]

## Environmental Impact

I ran the code on a server with 4 NVIDIA A40-48Q GPUs, configured with Python 3.10 and CUDA 12.2. The run took about eight hours.

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** NVIDIA A40-48Q GPU
- **Hours used:** 8
- **Cloud Provider:** [More Information Needed]
- **Compute Region:** [More Information Needed]
- **Carbon Emitted:** [More Information Needed]

## Technical Specifications [optional]

### Model Architecture and Objective

[More Information Needed]

### Compute Infrastructure

[More Information Needed]

#### Hardware

[More Information Needed]

#### Software

[More Information Needed]

## Citation [optional]

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## Glossary [optional]

[More Information Needed]

## More Information [optional]

[More Information Needed]

## Model Card Authors [optional]

[More Information Needed]

## Model Card Contact

[More Information Needed]

### Framework versions

- PEFT 0.7.1
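
### Evaluation sketch: pass@k

The Evaluation section lists pass@k on HumanEval-X as a to-do. As a minimal sketch of how that metric could be computed, the snippet below uses the commonly used unbiased estimator: for a problem with `n` generated samples of which `c` pass the unit tests, pass@k is `1 - C(n-c, k) / C(n, k)`, averaged over problems. The function names and the example tallies are hypothetical and are not results for this model. The per-problem pass counts are assumed to come from a separate test run, for example with Catch2 as described above.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which pass."""
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k failing samples: every size-k subset contains a pass.
    if n - c < k:
        return 1.0
    # 1 minus the probability that k samples drawn without replacement all fail.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems; each entry is (n_samples, n_passed)."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical tallies (samples generated, samples passing); not real results.
example_results = [(10, 3), (10, 0), (10, 10)]
print(round(mean_pass_at_k(example_results, k=1), 4))
```

Using `math.comb` keeps the arithmetic exact for the binomial coefficients before the final division, which avoids overflow for the sample counts typical of this kind of evaluation.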