flydust committed on
Commit 5ee112a • 1 Parent(s): e2744d8

Update README.md

Files changed (1)
  1. README.md +10 -5
README.md CHANGED
@@ -25,7 +25,7 @@ model-index:
 
 *Model full name: Llama3.1-MagpieLM-4B-Chat-v0.1*
 
- This model is an aligned version of [Llama-3.1-Minitron-4B-Width](https://huggingface.co/nvidia/Llama-3.1-Minitron-4B-Width-Base), which achieves state-of-the-art performance among open-aligned SLMs. It even outperforms larger open-weight models including Llama-3-8B-Instruct, Llama-3.1-8B-Instruct and Qwen-2-7B-Instruct.
+ This model is an aligned version of [Llama-3.1-Minitron-4B-Width](https://huggingface.co/nvidia/Llama-3.1-Minitron-4B-Width-Base), which achieves state-of-the-art performance among open-aligned SLMs. It even outperforms larger open-weight models including Llama-3-8B-Instruct, Llama-3.1-8B-Instruct and Qwen-2-7B-Instruct.
 
 We apply the following standard alignment pipeline with two carefully crafted synthetic datasets. Feel free to use these datasets and reproduce our model, or make your own friendly chatbots :)
 
@@ -34,6 +34,8 @@ We first perform SFT using [Magpie-Align/MagpieLM-SFT-Data-v0.1](https://hugging
 
 We then perform DPO on the [Magpie-Align/MagpieLM-DPO-Data-v0.1](https://huggingface.co/datasets/Magpie-Align/MagpieLM-DPO-Data-v0.1) dataset.
 
+ [*See more powerful 8B version here!*](https://huggingface.co/Magpie-Align/MagpieLM-8B-Chat-v0.1)
+
 ## 🔥 Benchmark Performance
 
 Greedy Decoding
@@ -44,16 +46,20 @@ Greedy Decoding
 
 **Benchmark Performance Compare to Other SOTA SLMs**
 
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/653df1323479e9ebbe3eb6cc/lMZ9M2h_9fJsjrw0BmPVD.png)
+ ![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/653df1323479e9ebbe3eb6cc/cNigvzqznKWRy1YfktZ6J.jpeg)
 
 ## 👀 Other Information
 
 **License**: Please follow [NVIDIA Open Model License Agreement](https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf).
 
- **Conversation Template**: Please use the Llama 3 chat template for the best performance.
+ **Conversation Template**: Please use the **Llama 3 chat template** for the best performance.
+
+ **Limitations**: This model primarily understands and generates content in English. Its outputs may contain factual errors, logical inconsistencies, or reflect biases present in the training data. While the model aims to improve instruction-following and helpfulness, it isn't specifically designed for complex reasoning tasks, potentially leading to suboptimal performance in these areas. Additionally, the model may produce unsafe or inappropriate content, as no specific safety training were implemented during the alignment process.
 
 ## 🧐 How to use it?
 
+ [![Spaces](https://img.shields.io/badge/🤗-Open%20in%20Spaces-blue)](https://huggingface.co/spaces/flydust/MagpieLM-4B)
+
 Please update transformers to the latest version by `pip install git+https://github.com/huggingface/transformers`.
 
 You can then run conversational inference using the Transformers `pipeline` abstraction or by leveraging the Auto classes with the `generate()` function.
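
The usage instructions carried as context in the hunk above point readers to the Transformers `pipeline` abstraction and, per the new **Conversation Template** line, the Llama 3 chat template. A minimal sketch of that flow follows; the repository id `Magpie-Align/MagpieLM-4B-Chat-v0.1` is an assumption inferred from the model's full name and the naming of the linked 8B sibling, not something stated in this diff.

```python
# Minimal inference sketch, not the README's verbatim example.
# Assumption: the model is hosted as "Magpie-Align/MagpieLM-4B-Chat-v0.1"
# (inferred from the 8B sibling's repo name); adjust the id if it differs.
import torch
from transformers import pipeline

model_id = "Magpie-Align/MagpieLM-4B-Chat-v0.1"

# The pipeline applies the model's (Llama 3) chat template to the messages.
chatbot = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a friendly chatbot."},
    {"role": "user", "content": "Explain greedy decoding in one sentence."},
]

# do_sample=False gives greedy decoding, matching the benchmark setting above.
outputs = chatbot(messages, max_new_tokens=128, do_sample=False)
print(outputs[0]["generated_text"][-1]["content"])
```

Equivalent results can be obtained with `AutoModelForCausalLM`/`AutoTokenizer` plus `apply_chat_template` and `generate()`, as the README notes.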
@@ -161,12 +167,11 @@ special_tokens:
   pad_token: <|end_of_text|>
 
 ```
-
 </details><br>
 
 ## Stage 2: Direct Preference Optimization
 
- ## Training procedure
+ We use [alignment handbook](https://github.com/huggingface/alignment-handbook) for DPO.
 
 ### Training hyperparameters
 
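
The final hunk swaps the generic "Training procedure" heading for a pointer to the alignment handbook for the DPO stage. For illustration only, and not the authors' recipe, a bare-bones DPO run on the named dataset with TRL (the library the alignment handbook builds on) could look roughly like this; the SFT checkpoint path, dataset split, and every hyperparameter are placeholders.

```python
# Hypothetical DPO sketch with TRL; NOT the MagpieLM training recipe.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

sft_checkpoint = "path/to/your-sft-checkpoint"  # placeholder path
tokenizer = AutoTokenizer.from_pretrained(sft_checkpoint)
model = AutoModelForCausalLM.from_pretrained(sft_checkpoint)

# Preference data named in the README; its columns may need mapping to the
# prompt/chosen/rejected layout DPOTrainer expects, and "train" is assumed.
dataset = load_dataset("Magpie-Align/MagpieLM-DPO-Data-v0.1", split="train")

args = DPOConfig(
    output_dir="magpielm-dpo",
    beta=0.1,                        # placeholder KL-penalty strength
    learning_rate=5e-7,              # placeholder
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
)

trainer = DPOTrainer(
    model=model,                     # reference model is created automatically
    args=args,
    train_dataset=dataset,
    processing_class=tokenizer,      # older TRL versions use tokenizer= instead
)
trainer.train()
```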