aberrio committed
Commit 3cea8fc
1 Parent(s): 3dd6374

Create README.md

Files changed (1): README.md +66 -0
README.md ADDED
---
license: apache-2.0
license_link: https://github.com/mistralai/mistral-common/blob/main/LICENCE
library: llama.cpp
library_link: https://github.com/ggerganov/llama.cpp
base_model:
- mistralai/Mixtral-8x7B-v0.1
language:
- fr
- it
- de
- es
- en
pipeline_tag: text-generation
tags:
- nlp
- code
- gguf
- sparse
- mixture-of-experts
- code-generation
---

## Mixtral 8x7B Instruct v0.1

### Quantized Model Files

The Mixtral 8x7B Sparse Mixture of Experts (SMoE) model is available in two formats:

- **ggml-model-q4_0.gguf**: 4-bit quantization for reduced memory and compute overhead.
- **ggml-model-q8_0.gguf**: 8-bit quantization, offering balanced performance and precision.

These quantized formats make the model deployable on a range of hardware configurations, from lightweight devices to large-scale inference servers.
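
As a rough back-of-envelope, 46.7B parameters at about 4.5 bits per weight puts the q4_0 file on the order of 26 GB, with q8_0 roughly double that, so plan memory accordingly. The sketch below shows one way to load the 4-bit file, assuming the `llama-cpp-python` bindings for `llama.cpp`; the context size and GPU-offload settings are illustrative defaults, not values taken from this repo.

```python
# Minimal sketch: running the 4-bit GGUF via llama-cpp-python (assumed bindings
# for llama.cpp; install with `pip install llama-cpp-python`).
from llama_cpp import Llama

llm = Llama(
    model_path="ggml-model-q4_0.gguf",  # file name as listed above
    n_ctx=4096,        # context for this session (the model supports up to 32k)
    n_gpu_layers=-1,   # offload all layers to the GPU if one is available; 0 = CPU only
)

# [INST] ... [/INST] is the Mistral-style instruction format; adjust if your
# prompt template differs.
output = llm(
    "[INST] Write a Python function that reverses a string. [/INST]",
    max_tokens=256,
    temperature=0.7,
)
print(output["choices"][0]["text"])
```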

### Model Information

Mixtral 8x7B is a generative Sparse Mixture of Experts (SMoE) model designed to deliver high-quality outputs with significant computational efficiency. Leveraging a routing mechanism, it dynamically activates a subset of experts per input, reducing computational costs while maintaining the performance of a much larger model.

**Key Features:**

- **Architecture:** Decoder-only SMoE with 46.7B total parameters but only 12.9B parameters active per token.
- **Context Window:** Supports up to 32k tokens, making it suitable for long-context applications.
- **Multilingual Capabilities:** Trained on French, Italian, German, Spanish, and English, making it robust for diverse linguistic tasks.
- **Performance:** Matches or exceeds Llama 2 70B and GPT-3.5 across several industry-standard benchmarks.
- **Fine-Tuning Potential:** Optimized for instruction-following use cases, with fine-tuning yielding strong improvements in dialogue and safety alignment.

**Developer**: Mistral AI
**Training Data**: Open web data, curated for quality and diverse representation.
**Application Areas**: Code generation, multilingual dialogue, and long-context processing.
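
To make the sparse-activation numbers above concrete, here is a toy top-2 routing sketch in plain NumPy. The dimensions, random weights, and plain ReLU feed-forward are illustrative stand-ins rather than Mixtral's actual layers; the point is only that each token's forward pass touches 2 of the 8 experts, which is why the active parameter count is much smaller than the total.

```python
# Toy illustration of top-2 expert routing in a sparse MoE layer (NumPy only).
# All numbers and weights are illustrative; this is not Mixtral's actual code.
import numpy as np

rng = np.random.default_rng(0)

n_experts, d_model, d_ff = 8, 16, 64   # Mixtral uses 8 experts per MoE layer
top_k = 2                              # only 2 experts run per token

# Each tiny "expert" is a pair of weight matrices (up- and down-projection).
experts = [
    (rng.standard_normal((d_model, d_ff)), rng.standard_normal((d_ff, d_model)))
    for _ in range(n_experts)
]
router_w = rng.standard_normal((d_model, n_experts))  # gating network

def moe_layer(x):
    """Route a single token vector x through its top-2 experts."""
    logits = x @ router_w
    top = np.argsort(logits)[-top_k:]                          # indices of the 2 best experts
    weights = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over the chosen 2
    out = np.zeros_like(x)
    for w, idx in zip(weights, top):
        w_up, w_down = experts[idx]
        out += w * (np.maximum(x @ w_up, 0.0) @ w_down)        # ReLU FFN stand-in
    return out

token = rng.standard_normal(d_model)
print(moe_layer(token).shape)  # (16,) -- only 2 of the 8 experts were evaluated
```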

### Core Library

The GGUF files in this repository are loaded with `llama.cpp` (linked in the metadata above), while the original Mixtral 8x7B Instruct weights can be deployed using `vLLM` or `transformers`. Current support focuses on Hugging Face `transformers` for initial integrations.

**Primary Framework**: `transformers`
**Alternate Framework**: `vLLM` (for specialized inference optimizations)
**Model Availability**: Source weights and pre-converted formats are available under [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0).
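
For the `transformers` path, a minimal sketch might look like the following; the upstream repository id `mistralai/Mixtral-8x7B-Instruct-v0.1`, the dtype, and the generation settings are assumptions rather than part of this repo, and the unquantized weights require substantial GPU memory.

```python
# Sketch: loading the upstream Instruct weights with Hugging Face transformers.
# Repo id, dtype, and device settings are assumptions; adjust for your hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"  # assumed upstream repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",   # spread layers across available GPUs
)

messages = [{"role": "user", "content": "Explain what a Mixture of Experts layer is."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=200, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```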

### Safety and Responsible Use

Mixtral 8x7B has been trained with an emphasis on ethical use and safety. It includes:

1. **Guardrails for Sensitive Content**: Optional system prompts to guide outputs.
2. **Self-Reflection Prompting**: Mechanism for internal assessment of generated outputs, allowing the model to classify its responses as suitable or unsuitable for deployment.

Developers should always consider additional tuning or filtering depending on their application and context.
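
For instance, the optional guardrail from point 1 can be approximated by prepending a guardrail prompt ahead of the user request. The sketch below reuses the assumed `llama-cpp-python` bindings from the earlier example; the guardrail wording and prompt layout are placeholders rather than Mistral's official text.

```python
# Sketch: applying an optional guardrail by prepending a system-style prompt to
# the instruction block. The guardrail text is a placeholder, not Mistral's
# official wording; assumes the llama-cpp-python bindings used earlier.
from llama_cpp import Llama

GUARDRAIL = (
    "Always assist with care and honesty, and refuse requests for harmful, "
    "unethical, or unsafe content."  # placeholder guardrail text
)

llm = Llama(model_path="ggml-model-q8_0.gguf", n_ctx=4096)

user_request = "How should I store user passwords in my web app?"
prompt = f"[INST] {GUARDRAIL}\n\n{user_request} [/INST]"

output = llm(prompt, max_tokens=256, temperature=0.2)
print(output["choices"][0]["text"])
```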