CobraMamba committed · Commit be03b95 · Parent: 4666fbc

Update README.md

README.md
---
language:
- en
library_name: transformers
tags:
- gpt
- llm
- large language model
inference: false
thumbnail: >-
  https://h2o.ai/etc.clientlibs/h2o/clientlibs/clientlib-site/resources/images/favicon.ico
license: apache-2.0
---

# Model Card

**The Best 3B Model! Surpassing dolly-v2-12b**

The best 3B model on MMLU (5-shot) on the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard), with performance surpassing dolly-v2-12b.

| Metric              | Value |
|---------------------|-------|
| MMLU (5-shot)       | 30.0  |
| ARC (25-shot)       | 42.6  |
| HellaSwag (10-shot) | 71.0  |
| TruthfulQA (0-shot) | 37.3  |
| Avg.                | 45.2  |
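As a quick sanity check, the reported Avg. appears to be the unweighted mean of the four benchmark scores. The card does not state the averaging rule explicitly, so treating it as a simple arithmetic mean is an assumption:

```python
# Sanity-check the "Avg." row: assume it is the unweighted arithmetic
# mean of the four benchmark scores listed in the table above.
scores = {
    "MMLU (5-shot)": 30.0,
    "ARC (25-shot)": 42.6,
    "HellaSwag (10-shot)": 71.0,
    "TruthfulQA (0-shot)": 37.3,
}
avg = sum(scores.values()) / len(scores)
print(round(avg, 1))  # → 45.2, matching the table
```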

We use the state-of-the-art [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) to run the benchmark tests above.

The training code and data will be open-sourced later on [GitHub](https://github.com/chi2liu/mamba-gpt-3b).