BossRui committed on
Commit ab610b1
1 Parent(s): c5b2c48

Update README.md

Files changed (1)
  1. README.md +37 -36
README.md CHANGED
@@ -31,7 +31,7 @@ language:
31
  - sq
32
  - da
33
  - sa
34
- - 'no'
35
  - gn
36
  - sr
37
  - sk
@@ -63,7 +63,7 @@ Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish,
63
 
64
 
65
  <p align="center">
66
- 📃 <a href="https://arxiv.org/abs/2410.10626" target="_blank">Paper</a> • 🌐 <a href="" target="_blank">Demo</a> • 🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloMoEDataset" target="_blank">ApolloMoEDataset</a> • 🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloMoEBench" target="_blank">ApolloMoEBench</a> • 🤗 <a href="https://huggingface.co/collections/FreedomIntelligence/apollomoe-and-apollo2-670ddebe3bb1ba1aebabbf2c" target="_blank">Models</a> • 🌐 <a href="https://github.com/FreedomIntelligence/Apollo" target="_blank">Apollo</a> • 🌐 <a href="https://github.com/FreedomIntelligence/ApolloMoE" target="_blank">ApolloMoE</a>
67
  </p>
68
 
69
 
@@ -76,50 +76,61 @@ Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish,
76
  * **[2024.10.15]** ApolloMoE repo is published!🎉
77
 
78
79
  ## Architecture
80
 
81
  <details>
82
  <summary>Click to view the MoE routing image</summary>
83
 
84
- ![ApolloMoE](/assets/hybrid_routing.png)
85
 
86
  </details>
87
 
88
  ## Results
89
 
90
- ### Dense
91
- 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-0.5B" target="_blank">Apollo2-0.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-1.5B" target="_blank">Apollo2-1.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-2B" target="_blank">Apollo2-2B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-3.8B" target="_blank">Apollo2-3.8B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-7B" target="_blank">Apollo2-7B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-9B" target="_blank">Apollo2-9B</a>
92
 
 
 
93
  <details>
94
  <summary>Click to view the Dense Models Results</summary>
95
-
96
  ![ApolloMoE](assets/dense_results.png)
97
 
98
  </details>
99
 
100
- ### Post-MoE
 
101
  🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo-MoE-0.5B" target="_blank">Apollo-MoE-0.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo-MoE-1.5B" target="_blank">Apollo-MoE-1.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo-MoE-7B" target="_blank">Apollo-MoE-7B</a>
102
-
103
  <details>
104
  <summary>Click to view the Post-MoE Models Results</summary>
105
-
106
  ![ApolloMoE](assets/post_moe_results.png)
107
 
108
  </details>
109
-
110
-
111
-
112
 
113
-
114
 
 
115
 
116
  ## Usage Format
117
- #### Apollo2
118
  - 0.5B, 1.5B, 7B: User:{query}\nAssistant:{response}<|endoftext|>
119
  - 2B, 9B: User:{query}\nAssistant:{response}\<eos\>
120
  - 3.8B: <|user|>\n{query}<|end|><|assistant|>\n{response}<|end|>
121
 
122
- #### Apollo-MoE
123
  - 0.5B, 1.5B, 7B: User:{query}\nAssistant:{response}<|endoftext|>
124
 
125
  ## Dataset & Evaluation
@@ -135,12 +146,12 @@ Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish,
135
 
136
 
137
  </details>
138
-
139
  - Evaluation
140
  🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloMoEBench" target="_blank">ApolloMoEBench</a>
141
 
142
  <details><summary>Click to expand</summary>
143
-
144
  - EN:
145
  - [MedQA-USMLE](https://huggingface.co/datasets/GBaker/MedQA-USMLE-4-options)
146
  - [MedMCQA](https://huggingface.co/datasets/medmcqa/viewer/default/test)
@@ -176,28 +187,27 @@ Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish,
176
  - PT: [BioInstructQA](https://huggingface.co/datasets/BioMistral/BioInstructQA): Portuguese part
177
  - RU: [RuMedBench](https://github.com/sb-ai-lab/MedBench)
178
 
179
-
180
-
181
 
182
 
183
  </details>
184
 
185
-
186
  ## Results reproduction
187
  <details><summary>Click to expand</summary>
188
 
189
-
190
- We take Gemma-2b as example
191
  1. Download Dataset for project:
192
 
193
  ```
194
- bash 0.download_data.sh
195
  ```
196
 
197
- 2. Prepare test and dev for specific model:
198
 
199
 
200
- - Create test data for with special token, you can use ./util/check.ipynb to check models' special tokens
201
 
202
  ```
203
  bash "1.data_process_test&dev.sh"
@@ -215,13 +225,11 @@ Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish,
215
  4. Train the model
216
 
217
 
218
- - If you want to train in Multi Nodes please refer to ./scripts/multi_node_train_*.sh
219
-
220
-
221
 
222
 
223
  ```
224
- bash 3.single_node_train_gemma.sh
225
  ```
226
 
227
 
@@ -231,12 +239,6 @@ Covering 12 Major Languages including English, Chinese, French, Hindi, Spanish,
231
  bash 4.eval.sh
232
  ```
233
 
234
- 6. Evaluate your model: Play with your ckpts in bash
235
-
236
- ```
237
- python ./src/evaluate/cli_demo.py --model_name='./ckpts/your/path/tfmr'
238
- ```
239
-
240
  </details>
241
 
242
 
@@ -255,4 +257,3 @@ Please use the following citation if you intend to use our dataset for training
255
  url={https://arxiv.org/abs/2410.10626},
256
  }
257
  ```
258
-
 
31
  - sq
32
  - da
33
  - sa
34
+ - no
35
  - gn
36
  - sr
37
  - sk
 
63
 
64
 
65
  <p align="center">
66
+ 📃 <a href="https://arxiv.org/abs/2410.10626" target="_blank">Paper</a> • 🌐 <a href="" target="_blank">Demo</a> • 🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloMoEDataset" target="_blank">ApolloMoEDataset</a> • 🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloMoEBench" target="_blank">ApolloMoEBench</a> • 🤗 <a href="https://huggingface.co/collections/FreedomIntelligence/apollomoe-and-apollo2-670ddebe3bb1ba1aebabbf2c" target="_blank">Models</a> • 🌐 <a href="https://github.com/FreedomIntelligence/Apollo" target="_blank">Apollo</a> • 🌐 <a href="https://github.com/FreedomIntelligence/ApolloMoE" target="_blank">ApolloMoE</a>
67
  </p>
68
 
69
 
 
76
  * **[2024.10.15]** ApolloMoE repo is published!🎉
77
 
78
 
79
+ ## Languages Coverage
80
+ 12 Major Languages and 38 Minor Languages
81
+
82
+ <details>
83
+ <summary>Click to view the Languages Coverage</summary>
84
+
85
+ ![ApolloMoE](assets/languages.png)
86
+
87
+ </details>
88
+
89
+
90
  ## Architecture
91
 
92
  <details>
93
  <summary>Click to view the MoE routing image</summary>
94
 
95
+ ![ApolloMoE](assets/hybrid_routing.png)
96
 
97
  </details>
98
 
99
  ## Results
100
 
101
+ #### Dense
102
+ 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-0.5B" target="_blank">Apollo2-0.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-1.5B" target="_blank">Apollo2-1.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-2B" target="_blank">Apollo2-2B</a>
103
 
104
+ 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-3.8B" target="_blank">Apollo2-3.8B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-7B" target="_blank">Apollo2-7B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo2-9B" target="_blank">Apollo2-9B</a>
105
+
106
  <details>
107
  <summary>Click to view the Dense Models Results</summary>
108
+
109
  ![ApolloMoE](assets/dense_results.png)
110
 
111
  </details>
112
 
113
+
114
+ #### Post-MoE
115
  🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo-MoE-0.5B" target="_blank">Apollo-MoE-0.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo-MoE-1.5B" target="_blank">Apollo-MoE-1.5B</a> • 🤗 <a href="https://huggingface.co/FreedomIntelligence/Apollo-MoE-7B" target="_blank">Apollo-MoE-7B</a>
116
+
117
  <details>
118
  <summary>Click to view the Post-MoE Models Results</summary>
119
+
120
  ![ApolloMoE](assets/post_moe_results.png)
121
 
122
  </details>
 
 
 
123
 
 
124
 
125
+
126
 
127
  ## Usage Format
128
+ ##### Apollo2
129
  - 0.5B, 1.5B, 7B: User:{query}\nAssistant:{response}<|endoftext|>
130
  - 2B, 9B: User:{query}\nAssistant:{response}\<eos\>
131
  - 3.8B: <|user|>\n{query}<|end|><|assistant|>\n{response}<|end|>
132
 
133
+ ##### Apollo-MoE
134
  - 0.5B, 1.5B, 7B: User:{query}\nAssistant:{response}<|endoftext|>
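
As a quick illustration, the templates above can be turned into prompt strings like this (a minimal sketch; `build_prompt` is our own helper, not part of the repo, and it only builds the prefix up to the point where the model starts generating):

```python
# Minimal sketch of the Usage Format templates above.
# build_prompt is a hypothetical helper, not part of the Apollo repo.

def build_prompt(query: str, size: str) -> str:
    """Return the single-turn prompt prefix for the given model size."""
    if size in ("0.5B", "1.5B", "7B"):
        # Generation should stop at <|endoftext|>
        return f"User:{query}\nAssistant:"
    if size in ("2B", "9B"):
        # Generation should stop at <eos>
        return f"User:{query}\nAssistant:"
    if size == "3.8B":
        # Generation should stop at <|end|>
        return f"<|user|>\n{query}<|end|><|assistant|>\n"
    raise ValueError(f"unknown model size: {size}")

print(build_prompt("What is hypertension?", "3.8B"))
```

The same `User:{query}\nAssistant:` prefix applies to the Apollo-MoE 0.5B/1.5B/7B models; only the stop token differs across sizes.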
135
 
136
  ## Dataset & Evaluation
 
146
 
147
 
148
  </details>
149
+
150
  - Evaluation
151
  🤗 <a href="https://huggingface.co/datasets/FreedomIntelligence/ApolloMoEBench" target="_blank">ApolloMoEBench</a>
152
 
153
  <details><summary>Click to expand</summary>
154
+
155
  - EN:
156
  - [MedQA-USMLE](https://huggingface.co/datasets/GBaker/MedQA-USMLE-4-options)
157
  - [MedMCQA](https://huggingface.co/datasets/medmcqa/viewer/default/test)
 
187
  - PT: [BioInstructQA](https://huggingface.co/datasets/BioMistral/BioInstructQA): Portuguese part
188
  - RU: [RuMedBench](https://github.com/sb-ai-lab/MedBench)
189
 
190
+
 
191
 
192
 
193
  </details>
194
 
195
+
196
  ## Results reproduction
197
  <details><summary>Click to expand</summary>
198
 
199
+
200
+ We take Apollo2-7B or Apollo-MoE-0.5B as an example
201
  1. Download Dataset for project:
202
 
203
  ```
204
+ bash 0.download_data.sh 
205
  ```
206
 
207
+ 2. Prepare the test and dev data for the specific model:
208
 
209
 
210
+ - Create test data with the model's special tokens
211
 
212
  ```
213
  bash "1.data_process_test&dev.sh"
 
225
  4. Train the model
226
 
227
 
228
+ - If you want to train on multiple nodes, please refer to ./src/sft/training_config/zero_multi.yaml
 
 
229
 
230
 
231
  ```
232
+ bash 3.single_node_train.sh
233
  ```
234
 
235
 
 
239
  bash 4.eval.sh
240
  ```
241
242
  </details>
243
 
244
 
 
257
  url={https://arxiv.org/abs/2410.10626},
258
  }
259
  ```