We use the state-of-the-art [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) for evaluation.
The training code and data will be open-sourced later on GitHub (https://github.com/chi2liu/mamba-gpt-3b).
## Training Dataset

`mamba-gpt-3b-v4` is trained on multiple datasets:

- [Stanford Alpaca (en)](https://github.com/tatsu-lab/stanford_alpaca)
- [Open Assistant (multilingual)](https://huggingface.co/datasets/OpenAssistant/oasst1)
- [LIMA (en)](https://huggingface.co/datasets/GAIR/lima)
- [CodeAlpaca 20k (en)](https://huggingface.co/datasets/sahil2801/CodeAlpaca-20k)
- [GPT-4 Generated Data (en&zh)](https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM)
- [UltraChat (en)](https://github.com/thunlp/UltraChat)

## Summary

We fine-tuned the OpenLLaMA base model and surpassed the original model on multiple evaluation subtasks, making it currently the best-performing 3B model, with performance comparable to llama-7b.

- Base model: [openlm-research/open_llama_3b_v2](https://huggingface.co/openlm-research/open_llama_3b_v2)
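For quick experimentation, the fine-tuned checkpoint can be loaded with the `transformers` library like any LLaMA-family causal LM. A minimal sketch, assuming the weights are published under the `CobraMamba/mamba-gpt-3b-v4` repo id (adjust if the actual path differs):

```python
# Minimal inference sketch with Hugging Face transformers.
# NOTE: the repo id below is an assumption; replace it with the actual
# model path on the Hub if it differs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "CobraMamba/mamba-gpt-3b-v4"  # assumed repo id


def load_model(model_id: str = MODEL_ID):
    """Download the tokenizer and model weights (several GB for a 3B model)."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # half precision to fit on a single GPU
        device_map="auto",          # requires the `accelerate` package
    )
    return tokenizer, model


if __name__ == "__main__":
    tokenizer, model = load_model()
    inputs = tokenizer("What is the capital of France?", return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Generation parameters (sampling, temperature, stop sequences) are left at their defaults here; tune them for your use case.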