# Model Card for Mistral-7B-Instruct-v0.1-8bit

Mistral-7B-Instruct-v0.1-8bit is an 8-bit quantized version of [Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1): the base model was loaded in 8-bit with `torch_dtype=torch.float16` and pushed to this repository.

For full details of the base model, please read Mistral's [paper](https://arxiv.org/abs/2310.06825) and [release blog post](https://mistral.ai/news/la-plateforme/).
The quantized weights were produced and uploaded with the code below (8-bit loading requires `bitsandbytes`, and `use_flash_attention_2` requires the `flash-attn` package):

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "mistralai/Mistral-7B-Instruct-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load the base model in 8-bit (bitsandbytes) with fp16 compute and Flash Attention 2
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_8bit=True,
    use_flash_attention_2=True,
    torch_dtype=torch.float16,
)

# Push the quantized weights to the Hub
model.push_to_hub("LsTam/Mistral-7B-Instruct-v0.1-8bit")
```
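As a quick sanity check (not part of the original card), the on-device size of the quantized model can be inspected with `get_memory_footprint`, which `transformers` provides on every model:

```python
# Continues from the snippet above; expect roughly ~7 GB in 8-bit vs ~14 GB in fp16 for a 7B model
print(f"Memory footprint: {model.get_memory_footprint() / 1e9:.1f} GB")
```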
To use it:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Tokenizer from the upstream Mistral repo; quantized weights from this repo
tok_name = "mistralai/Mistral-7B-Instruct-v0.2"
model_name = "LsTam/Mistral-7B-Instruct-v0.1-8bit"

tokenizer = AutoTokenizer.from_pretrained(tok_name)

# The 8-bit quantization config is stored with the weights, so no quantization flags are needed here
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    use_flash_attention_2=True,
)
```
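For completeness, a minimal generation sketch, continuing from the loading snippet above; the prompt and generation settings are illustrative and not part of the original card:

```python
# Build a chat-formatted prompt and generate (assumes `tokenizer` and `model` from above)
messages = [{"role": "user", "content": "Explain 8-bit quantization in one sentence."}]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

output = model.generate(input_ids, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```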