CobraMamba committed · Commit 900f740 · Parent: be03b95

Update README.md

README.md CHANGED

@@ -31,3 +31,18 @@ We use state-of-the-art [Language Model Evaluation Harness](https://github.com/E
The training code and data will be open sourced later on GitHub (https://github.com/chi2liu/mamba-gpt-3b)
## Training Dataset

`mamba-gpt-3b-v4` is trained on multiple datasets:
- [Stanford Alpaca (en)](https://github.com/tatsu-lab/stanford_alpaca)
- [Open Assistant (multilingual)](https://huggingface.co/datasets/OpenAssistant/oasst1)
- [LIMA (en)](https://huggingface.co/datasets/GAIR/lima)
- [CodeAlpaca 20k (en)](https://huggingface.co/datasets/sahil2801/CodeAlpaca-20k)
- [GPT-4 Generated Data (en&zh)](https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM)
- [UltraChat (en)](https://github.com/thunlp/UltraChat)
43 |
+
|
44 |
+
|
45 |
+
## Summary
We have fine-tuned the open-llama model and surpassed the original model on multiple evaluation subtasks, making it currently the best-performing 3B model, with performance comparable to llama-7b.

- Base model: [openlm-research/open_llama_3b_v2](https://huggingface.co/openlm-research/open_llama_3b_v2)
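Since the model is instruction-tuned on Alpaca-style data, prompts are typically formatted with an instruction template before generation. The sketch below shows the standard Stanford Alpaca prompt template as an illustration; whether `mamba-gpt-3b-v4` uses exactly this template is an assumption here, so check the model card for the expected format before use.

```python
# Sketch: Alpaca-style prompt formatting for an instruction-tuned model.
# ASSUMPTION: mamba-gpt-3b-v4 follows the Stanford Alpaca template used by
# its training data; verify against the published model card.

def build_prompt(instruction: str, input_text: str = "") -> str:
    """Format an instruction (and optional context) in the Alpaca template."""
    if input_text:
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

# The resulting string is what you would pass to the tokenizer before generation.
prompt = build_prompt("Summarize the benefits of small language models.")
print(prompt)
```

The model's completion is then read from the text generated after the final `### Response:` marker.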