chiliu committed
Commit
89c2cad
1 Parent(s): f61111b
Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -177,14 +177,14 @@ The original LLaMA model was trained for 1 trillion tokens and GPT-J was trained
 
  | **Task/Metric** | Mamba-GPT 3B | LLaMA 7B | OpenLLaMA 7B | OpenLLaMA 3B | OpenLLaMA 13B 600BT |
  | ---------------------- | -------- | -------- | ------------ | ------------ | ------------------- |
- | anli_r1/acc | 0.35 | 0.35 | 0.33 | 0.33 | 0.33 |
+ | anli_r1/acc | **0.35** | 0.35 | 0.33 | 0.33 | 0.33 |
  | anli_r2/acc | 0.33 | 0.34 | 0.36 | 0.32 | 0.35 |
  | anli_r3/acc | 0.35 | 0.37 | 0.38 | 0.35 | 0.38 |
  | arc_challenge/acc | 0.35 | 0.39 | 0.37 | 0.34 | 0.39 |
  | arc_challenge/acc_norm | 0.37 | 0.41 | 0.38 | 0.37 | 0.42 |
  | arc_easy/acc | 0.71 | 0.68 | 0.72 | 0.69 | 0.74 |
  | arc_easy/acc_norm | 0.65 | 0.52 | 0.68 | 0.65 | 0.70 |
- | boolq/acc | 0.72 | 0.56 | 0.53 | 0.66 | 0.71 |
+ | boolq/acc | **0.72** | 0.56 | 0.53 | 0.66 | 0.71 |
  | hellaswag/acc | 0.49 | 0.36 | 0.63 | 0.43 | 0.54 |
  | hellaswag/acc_norm | 0.66 | 0.73 | 0.72 | 0.67 | 0.73 |
  | openbookqa/acc | 0.26 | 0.29 | 0.30 | 0.27 | 0.30 |
@@ -194,8 +194,8 @@ The original LLaMA model was trained for 1 trillion tokens and GPT-J was trained
  | record/em | 0.88 | 0.91 | 0.89 | 0.88 | 0.90 |
  | record/f1 | 0.88 | 0.91 | 0.90 | 0.89 | 0.90 |
  | rte/acc | 0.55 | 0.56 | 0.60 | 0.58 | 0.65 |
- | truthfulqa_mc/mc1 | 0.27 | 0.21 | 0.23 | 0.22 | 0.22 |
- | truthfulqa_mc/mc2 | 0.37 | 0.34 | 0.35 | 0.35 | 0.35 |
+ | truthfulqa_mc/mc1 | **0.27** | 0.21 | 0.23 | 0.22 | 0.22 |
+ | truthfulqa_mc/mc2 | **0.37** | 0.34 | 0.35 | 0.35 | 0.35 |
  | wic/acc | 0.49 | 0.50 | 0.51 | 0.48 | 0.49 |
  | winogrande/acc | 0.63 | 0.68 | 0.67 | 0.62 | 0.67 |
  | Average | 0.53 | 0.53 | 0.55 | 0.52 | 0.56 |
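
For anyone spot-checking the table, here is a minimal Python sketch of how the Average row is presumably derived: an unweighted mean over the per-task metrics in a column. This is an assumption, not the author's evaluation script, and the dict below holds only the Mamba-GPT 3B rows visible in this diff (the hunks skip README lines 191-193), so it lands slightly below the reported 0.53.

```python
# Minimal sketch, assuming "Average" is the unweighted mean of the
# per-task scores in one model column. Values copied from the diff above;
# three README rows fall between the two hunks and are not included here.
mamba_gpt_3b = {
    "anli_r1/acc": 0.35, "anli_r2/acc": 0.33, "anli_r3/acc": 0.35,
    "arc_challenge/acc": 0.35, "arc_challenge/acc_norm": 0.37,
    "arc_easy/acc": 0.71, "arc_easy/acc_norm": 0.65, "boolq/acc": 0.72,
    "hellaswag/acc": 0.49, "hellaswag/acc_norm": 0.66, "openbookqa/acc": 0.26,
    "record/em": 0.88, "record/f1": 0.88, "rte/acc": 0.55,
    "truthfulqa_mc/mc1": 0.27, "truthfulqa_mc/mc2": 0.37,
    "wic/acc": 0.49, "winogrande/acc": 0.63,
}

avg = sum(mamba_gpt_3b.values()) / len(mamba_gpt_3b)
print(f"mean over {len(mamba_gpt_3b)} visible tasks: {avg:.2f}")  # -> 0.52
```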