---
license: apache-2.0
language:
- en
---
# Model Card for Llama 2 7B GPTQ 2-bit

This is Meta's Llama 2 7B quantized to 2-bit precision with AutoGPTQ, through its integration in Hugging Face Transformers.
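
To give a rough sense of what 2-bit precision means for storage, the sketch below estimates the size of the weights alone at several bit widths. It assumes a nominal 7 billion parameters and ignores GPTQ overhead (scales, zero points) and any layers left unquantized, so real checkpoint sizes will differ. The helper name `approx_weight_gib` is hypothetical.

```python
def approx_weight_gib(n_params: float, bits: int) -> float:
    """Approximate weight storage in GiB for n_params weights at `bits` bits each.

    Ignores quantization metadata (scales, zero points) and unquantized layers.
    """
    total_bytes = n_params * bits / 8
    return total_bytes / 1024**3


if __name__ == "__main__":
    nominal_params = 7e9  # assumption: nominal "7B" parameter count
    for bits in (16, 4, 3, 2):
        print(f"{bits:>2}-bit: ~{approx_weight_gib(nominal_params, bits):.2f} GiB")
```
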
## Model Details

### Model Description


- **Developed by:** [The Kaitchup](https://kaitchup.substack.com/)
- **Model type:** Causal language model (Llama 2)
- **Language(s) (NLP):** English
- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0), [Llama 2 license agreement](https://ai.meta.com/resources/models-and-libraries/llama-downloads/)

### Model Sources

The method and code used to quantize the model are explained here:
[Quantize and Fine-tune LLMs with GPTQ Using Transformers and TRL](https://kaitchup.substack.com/p/quantize-and-fine-tune-llms-with)
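
For orientation, GPTQ quantization through Transformers can be sketched roughly as follows. This is a minimal sketch and not necessarily the exact script from the article. The base repository `meta-llama/Llama-2-7b-hf`, the `c4` calibration dataset, and the output directory name are assumptions. The function name `quantize_llama2_gptq` is hypothetical. Running it requires `transformers`, `optimum`, `auto-gptq`, a GPU, and access to the gated Llama 2 weights.

```python
def quantize_llama2_gptq(
    bits: int = 2,
    base_model: str = "meta-llama/Llama-2-7b-hf",  # assumption: base checkpoint
    out_dir: str = "Llama-2-7b-gptq-2bit",         # assumption: local output path
) -> str:
    """Quantize a causal LM with GPTQ and save it to out_dir."""
    # GPTQConfig accepts only these bit widths.
    if bits not in (2, 3, 4, 8):
        raise ValueError(f"bits must be one of 2, 3, 4, 8, got {bits}")

    # Imported lazily so the heavy dependencies are only needed when quantizing.
    from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

    tokenizer = AutoTokenizer.from_pretrained(base_model, use_fast=True)

    # "c4" is an assumed calibration dataset; the article may use another one.
    quantization_config = GPTQConfig(bits=bits, dataset="c4", tokenizer=tokenizer)

    # Passing a GPTQConfig triggers calibration and quantization at load time.
    model = AutoModelForCausalLM.from_pretrained(
        base_model,
        quantization_config=quantization_config,
        device_map="auto",
    )

    model.save_pretrained(out_dir)
    tokenizer.save_pretrained(out_dir)
    return out_dir
```

Calling `quantize_llama2_gptq(bits=2)` would produce a local 2-bit checkpoint; changing `bits` to 3 or 4 corresponds to the other precisions GPTQ supports.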

## Uses

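This is a quantized version of the pre-trained base model; it has not been fine-tuned. You can fine-tune it with PEFT by training adapters on top of the quantized weights.
Note that 2-bit quantization significantly degrades the model's performance compared with the original Llama 2 7B.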
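
As an illustration of adapter-based fine-tuning, the sketch below loads a GPTQ checkpoint and attaches a LoRA adapter with PEFT. LoRA is one adapter type among several, and the hyperparameters and target modules here are illustrative assumptions, not tuned or recommended values. The helper names `lora_settings` and `load_with_lora` are hypothetical, and `model_id` must be supplied by the caller. Running it requires `transformers`, `peft`, `optimum`, `auto-gptq`, and a GPU.

```python
def lora_settings() -> dict:
    """Illustrative LoRA hyperparameters (assumptions, not tuned values)."""
    return {
        "r": 16,
        "lora_alpha": 16,
        "lora_dropout": 0.05,
        "bias": "none",
        "task_type": "CAUSAL_LM",
        # Assumption: adapt only the attention query/value projections.
        "target_modules": ["q_proj", "v_proj"],
    }


def load_with_lora(model_id: str):
    """Load a GPTQ-quantized causal LM and wrap it with a trainable LoRA adapter."""
    # Imported lazily so the heavy dependencies are only needed when training.
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    # Prepares the quantized model for training (e.g. freezes base weights).
    model = prepare_model_for_kbit_training(model)

    # Only the adapter weights are trainable; the quantized weights stay frozen.
    model = get_peft_model(model, LoraConfig(**lora_settings()))
    model.print_trainable_parameters()
    return model, tokenizer
```

The returned model can then be passed to a standard training loop or trainer; given the accuracy loss at 2-bit, comparing results against the 3-bit or 4-bit versions may be worthwhile.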


## Other versions

- [kaitchup/Llama-2-7b-gptq-4bit](https://huggingface.co/kaitchup/Llama-2-7b-gptq-4bit)
- [kaitchup/Llama-2-7b-gptq-3bit](https://huggingface.co/kaitchup/Llama-2-7b-gptq-3bit)




## Model Card Contact

[The Kaitchup](https://kaitchup.substack.com/)