BossRui committed on
Commit
3fb490c
1 Parent(s): 5425610

Update README.md

Files changed (1)
  1. README.md +34 -34
README.md CHANGED
@@ -31,7 +31,7 @@ language:
31
  - sq
32
  - da
33
  - sa
34
- - 'no'
35
  - gn
36
  - sr
37
  - sk
@@ -62,9 +62,8 @@ Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish,
62
 
63
 
64
 
65
-
66
  <p align="center">
67
- 📃 <a href="https://arxiv.org/abs/2410.10626" target="_blank">Paper</a> • 🌐 <a href="" target="_blank">Demo</a> • 🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloMoEDataset" target="_blank">ApolloMoEDataset</a> • 🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloMoEBench" target="_blank">ApolloMoEBench</a> • 🤗 <a href="https://huggingface.co/collections/FreedomIntelligence/apollomoe-and-apollo2-670ddebe3bb1ba1aebabbf2c" target="_blank">Models</a> • 🌐 <a href="https://github.com/FreedomIntelligence/Apollo" target="_blank">Apollo</a> • 🌐 <a href="https://github.com/FreedomIntelligence/ApolloMoE" target="_blank">ApolloMoE</a>
68
  </p>
69
 
70
 
@@ -77,19 +76,32 @@ Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish,
77
  * **[2024.10.15]** ApolloMoE repo is published!🎉
78
 
79
 
 
 
 
 
 
 
 
 
 
 
 
80
  ## Architecture
81
 
82
  <details>
83
  <summary>Click to view the MoE routing image</summary>
84
 
85
- ![ApolloMoE](/assets/hybrid_routing.png)
86
 
87
  </details>
88
 
89
  ## Results
90
 
91
- ### Dense
92
- 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-0.5B" target="_blank">Apollo2-0.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-1.5B" target="_blank">Apollo2-1.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-2B" target="_blank">Apollo2-2B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-3.8B" target="_blank">Apollo2-3.8B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-7B" target="_blank">Apollo2-7B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-9B" target="_blank">Apollo2-9B</a>
 
 
93
 
94
  <details>
95
  <summary>Click to view the Dense Models Results</summary>
@@ -98,7 +110,8 @@ Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish,
98
 
99
  </details>
100
 
101
- ### Post-MoE
 
102
  🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo-MoE-0.5B" target="_blank">Apollo-MoE-0.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo-MoE-1.5B" target="_blank">Apollo-MoE-1.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo-MoE-7B" target="_blank">Apollo-MoE-7B</a>
103
 
104
  <details>
@@ -109,18 +122,15 @@ Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish,
109
  </details>
110
 
111
 
112
-
113
-
114
-
115
-
116
 
117
  ## Usage Format
118
- #### Apollo2
119
  - 0.5B, 1.5B, 7B: User:{query}\nAssistant:{response}<|endoftext|>
120
  - 2B, 9B: User:{query}\nAssistant:{response}\<eos\>
121
  - 3.8B: <|user|>\n{query}<|end|><|assistant|>\n{response}<|end|>
122
 
123
- #### Apollo-MoE
124
  - 0.5B, 1.5B, 7B: User:{query}\nAssistant:{response}<|endoftext|>
125
 
126
  ## Dataset & Evaluation
@@ -177,9 +187,7 @@ Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish,
177
  - PT: [BioInstructQA](https://huggingface.co/datasets/BioMistral/BioInstructQA): Portuguese part
178
  - RU: [RuMedBench](https://github.com/sb-ai-lab/MedBench)
179
 
180
-
181
-
182
-
183
 
184
 
185
  </details>
@@ -189,17 +197,17 @@ Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish,
189
  <details><summary>Click to expand</summary>
190
 
191
 
192
- We take Gemma-2b as example
193
  1. Download Dataset for project:
194
 
195
  ```
196
- bash 0.download_data.sh
197
  ```
198
 
199
- 2. Prepare test and dev for specific model:
200
 
201
 
202
- - Create test data for with special token, you can use ./util/check.ipynb to check models' special tokens
203
 
204
  ```
205
  bash "1.data_process_test&dev.sh"
@@ -207,23 +215,21 @@ Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish,
207
 
208
  3. Prepare train data for specific model (Create tokenized data in advance):
209
 
210
-
211
- - You can adjust data Training order and Training Epoch in this step
212
 
 
 
213
  ```
214
  bash 2.data_process_train.sh
215
  ```
216
-
217
  4. Train the model
218
 
219
-
220
- - If you want to train in Multi Nodes please refer to ./scripts/multi_node_train_*.sh
221
-
222
-
223
 
224
 
225
  ```
226
- bash 3.single_node_train_gemma.sh
227
  ```
228
 
229
 
@@ -233,12 +239,6 @@ Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish,
233
  bash 4.eval.sh
234
  ```
235
 
236
- 6. Evaluate your model: Play with your ckpts in bash
237
-
238
- ```
239
- python ./src/evaluate/cli_demo.py --model_name='./ckpts/your/path/tfmr'
240
- ```
241
-
242
  </details>
243
 
244
 
 
31
  - sq
32
  - da
33
  - sa
34
+ - no
35
  - gn
36
  - sr
37
  - sk
 
62
 
63
 
64
 
 
65
  <p align="center">
66
+ 📃 <a href="https://arxiv.org/abs/2410.10626" target="_blank">Paper</a> • 🌐 <a href="" target="_blank">Demo</a> • 🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloMoEDataset" target="_blank">ApolloMoEDataset</a> • 🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloMoEBench" target="_blank">ApolloMoEBench</a> • 🤗 <a href="https://huggingface.co/collections/FreedomIntelligence/apollomoe-and-apollo2-670ddebe3bb1ba1aebabbf2c" target="_blank">Models</a> •🌐 <a href="https://github.com/FreedomIntelligence/Apollo" target="_blank">Apollo</a> • 🌐 <a href="https://github.com/FreedomIntelligence/ApolloMoE" target="_blank">ApolloMoE</a>
67
  </p>
68
 
69
 
 
76
  * **[2024.10.15]** ApolloMoE repo is published!🎉
77
 
78
 
79
+ ## Languages Coverage
80
+ 12 Major Languages and 38 Minor Languages
81
+
82
+ <details>
83
+ <summary>Click to view the Languages Coverage</summary>
84
+
85
+ ![ApolloMoE](assets/languages.png)
86
+
87
+ </details>
88
+
89
+
90
  ## Architecture
91
 
92
  <details>
93
  <summary>Click to view the MoE routing image</summary>
94
 
95
+ ![ApolloMoE](assets/hybrid_routing.png)
96
 
97
  </details>
98
 
99
  ## Results
100
 
101
+ #### Dense
102
+ 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-0.5B" target="_blank">Apollo2-0.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-1.5B" target="_blank">Apollo2-1.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-2B" target="_blank">Apollo2-2B</a>
103
+
104
+ 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-3.8B" target="_blank">Apollo2-3.8B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-7B" target="_blank">Apollo2-7B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-9B" target="_blank">Apollo2-9B</a>
105
 
106
  <details>
107
  <summary>Click to view the Dense Models Results</summary>
 
110
 
111
  </details>
112
 
113
+
114
+ #### Post-MoE
115
  🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo-MoE-0.5B" target="_blank">Apollo-MoE-0.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo-MoE-1.5B" target="_blank">Apollo-MoE-1.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo-MoE-7B" target="_blank">Apollo-MoE-7B</a>
116
 
117
  <details>
 
122
  </details>
123
 
124
 
125
+
 
 
 
126
 
127
  ## Usage Format
128
+ ##### Apollo2
129
  - 0.5B, 1.5B, 7B: User:{query}\nAssistant:{response}<|endoftext|>
130
  - 2B, 9B: User:{query}\nAssistant:{response}\<eos\>
131
  - 3.8B: <|user|>\n{query}<|end|><|assistant|>\n{response}<|end|>
132
 
133
+ ##### Apollo-MoE
134
  - 0.5B, 1.5B, 7B: User:{query}\nAssistant:{response}<|endoftext|>
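
The templates listed under Usage Format can be applied with a small helper. The following is a minimal sketch, not code from the repository: the names `build_prompt`, `format_example`, and `END_TOKENS` are hypothetical, `\n` is assumed to mean a literal newline, `\<eos\>` is assumed to be the markdown-escaped form of `<eos>`, and the 3.8B assistant tag is written as `<|assistant|>` on the assumption that this is the intended token.

```python
# Prompt prefixes, i.e. everything up to the point where the response begins.
PLAIN = "User:{query}\nAssistant:"
TAGGED = "<|user|>\n{query}<|end|><|assistant|>\n"

# (family, size) -> marker that terminates the response, per the list above.
END_TOKENS = {
    ("Apollo2", "0.5B"): "<|endoftext|>",
    ("Apollo2", "1.5B"): "<|endoftext|>",
    ("Apollo2", "7B"): "<|endoftext|>",
    ("Apollo2", "2B"): "<eos>",
    ("Apollo2", "9B"): "<eos>",
    ("Apollo2", "3.8B"): "<|end|>",
    ("Apollo-MoE", "0.5B"): "<|endoftext|>",
    ("Apollo-MoE", "1.5B"): "<|endoftext|>",
    ("Apollo-MoE", "7B"): "<|endoftext|>",
}


def build_prompt(family: str, size: str, query: str) -> str:
    """Return the text a model would be asked to continue (no response yet)."""
    if (family, size) not in END_TOKENS:
        raise ValueError(f"no template listed for {family} {size}")
    # Only the 3.8B model uses the tagged layout; all others use User:/Assistant:.
    template = TAGGED if (family, size) == ("Apollo2", "3.8B") else PLAIN
    return template.format(query=query)


def format_example(family: str, size: str, query: str, response: str) -> str:
    """Return a full query/response pair, including the end marker."""
    return build_prompt(family, size, query) + response + END_TOKENS[(family, size)]


if __name__ == "__main__":
    print(format_example("Apollo2", "7B", "What is anemia?", "A low red cell count."))
```

`build_prompt` would be the inference-time input, while `format_example` mirrors the complete strings shown in the list.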
135
 
136
  ## Dataset & Evaluation
 
187
  - PT: [BioInstructQA](https://huggingface.co/datasets/BioMistral/BioInstructQA): Portuguese part
188
  - RU: [RuMedBench](https://github.com/sb-ai-lab/MedBench)
189
 
190
+
 
 
191
 
192
 
193
  </details>
 
197
  <details><summary>Click to expand</summary>
198
 
199
 
200
+ We take Apollo2-7B or Apollo-MoE-0.5B as an example.
201
  1. Download Dataset for project:
202
 
203
  ```
204
+ bash 0.download_data.sh 
205
  ```
206
 
207
+ 2. Prepare test and dev data for specific model:
208
 
209
 
210
+ - Create test data with special tokens
211
 
212
  ```
213
  bash "1.data_process_test&dev.sh"
 
215
 
216
  3. Prepare train data for specific model (Create tokenized data in advance):
217
 
 
 
218
 
219
+ - You can adjust the training data order and the number of training epochs in this step
220
+
221
  ```
222
  bash 2.data_process_train.sh
223
  ```
224
+
225
  4. Train the model
226
 
227
+
228
+ - To train on multiple nodes, refer to ./src/sft/training_config/zero_multi.yaml
 
 
229
 
230
 
231
  ```
232
+ bash 3.single_node_train.sh
233
  ```
234
 
235
 
 
239
  bash 4.eval.sh
240
  ```
241
 
 
 
 
 
 
 
242
  </details>
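
The numbered scripts above are meant to be run in order. As a rough illustration only, the sequence could be driven from Python as sketched below. The helper names `pipeline_commands` and `run_pipeline` are hypothetical; the script names come from the new side of this change, and the sketch assumes it is run from the directory that contains them. Passing each name as a separate argument (no shell) keeps the `&` in `1.data_process_test&dev.sh` literal instead of letting a shell treat it as a background operator.

```python
import subprocess

# Script names as listed in the steps above, in execution order.
SCRIPTS = [
    "0.download_data.sh",
    "1.data_process_test&dev.sh",
    "2.data_process_train.sh",
    "3.single_node_train.sh",
    "4.eval.sh",
]


def pipeline_commands(scripts=SCRIPTS):
    """Return one argv list per step; no shell parsing is involved."""
    return [["bash", name] for name in scripts]


def run_pipeline(dry_run=True):
    """Run every step in order, or only print the commands when dry_run is set."""
    for argv in pipeline_commands():
        if dry_run:
            print(" ".join(argv))
        else:
            # check=True aborts the sequence at the first failing step.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

The dry run is the default so that nothing is downloaded or trained by accident; call `run_pipeline(dry_run=False)` to execute the steps.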
243
 
244