FreshBench/data/model_release_time.csv
jijivski
Model,Release Date,model,MMLU,GSM8,Humanities,SocialSciences,STEM,Other,Longbench
Baichuan2-13B-Base,2023-08-24,Baichuan2-13B-Base,58.1,52.7,51.54,66.2,47.89,65.24,62.55
Baichuan2-13B-Chat,2023-06-24,Baichuan2-13B-Chat,52.1,55.0,50.71,65.19,47.13,65.01,
Baichuan2-7B-Base,2023-08-24,Baichuan2-7B-Base,54.0,24.4,46.87,58.73,42.63,58.9,16.32
Baichuan2-7B-Chat,2023-08-24,Baichuan2-7B-Chat,52.9,32.0,46.44,58.82,41.93,59.22,32.22
Colossal-LLaMA-2-7b-base,2023-09-24,Colossal-LLaMA-2-7b-base,53.06,9.0,73.1,75.0,34.8,44.0,23.83
HF_RWKV_v5-Eagle-7B,2023-11-15,HF_RWKV_v5-Eagle-7B,33.04,9.3,32.58,34.94,28.29,36.66,19.33
Llama-2-13b-hf,2023-07-18,Llama-2-13b-hf,55.77,22.8,76.0,82.0,28.6,46.4,7.39
Llama-2-7b-hf,2023-07-18,Llama-2-7b-hf,46.87,14.4,70.2,65.0,38.4,42.2,15.29
Qwen-14B-Chat,2023-09-24,Qwen-14B-Chat,66.5,59.0,58.24,74.78,56.87,70.78,38.72
Qwen-1_8B,2023-11-30,Qwen-1_8B,45.3,32.0,40.77,50.93,37.04,51.92,35.4
Qwen-1_8B-Chat,2023-11-30,Qwen-1_8B-Chat,43.99,4.0,39.91,50.08,38.47,49.73,14.39
Qwen-7B,2023-09-24,Qwen-7B,59.84,44.9,75.4,80.0,37.5,48.2,45.53
Qwen-7B-Chat,2023-09-24,Qwen-7B-Chat,57.0,54.0,47.86,64.32,46.91,61.64,33.89
Skywork-13B-base,2023-10-22,Skywork-13B-base,62.1,55.0,56.62,70.13,47.19,67.69,23.48
TinyLlama-1.1B-Chat-v0.6,2023-11-24,TinyLlama-1.1B-Chat-v0.6,25.98,2.1,22.2,28.0,32.1,23.5,5.05
Yi-6B,2024-01-17,Yi-6B,64.11,12.1,83.0,85.0,42.9,45.8,38.89
Yi-6B-Chat,2024-01-17,Yi-6B-Chat,58.24,38.4,55.75,72.41,51.57,69.91,39.54
baichuan-13b-chat,2023-06-24,baichuan-13b-chat,52.1,0.1,43.95,56.48,38.19,56.2,9.29
baichuan-7b-chat,2023-09-24,baichuan-7b-chat,42.8,9.1,40.83,46.96,35.17,47.28,23.3
chatglm3-6b,2023-10-24,chatglm3-6b,61.4,72.0,46.12,59.12,42.82,56.49,
falcon-rw-1b,2023-04-24,falcon-rw-1b,25.28,0.5,29.2,28.0,29.5,28.9,5.31
interlm-20b,2023-09-18,interlm-20b,61.85,23.0,,,,,
internlm-chat-7b,2023-06-06,internlm-chat-7b,50.8,34.0,46.08,58.56,40.28,56.68,9.73
llama2-7b-chat-hf,2023-07-18,llama2-7b-chat-hf,48.32,45.5,72.5,72.0,30.4,43.4,19.58
llama_hf_7b,2023-02-18,llama_hf_7b,46.87,10.0,31.99,31.75,28.35,36.63,4.7
mistral-7b-v0.1,2023-09-18,mistral-7b-v0.1,64.16,37.8,83.0,86.0,48.2,55.4,32.2
opt-13b,2022-05-11,opt-13b,24.9,1.7,31.0,27.0,36.6,25.9,
opt-2.7b,2022-05-11,opt-2.7b,25.43,0.2,26.67,24.63,26.86,24.01,66.67
phi-1_5,2023-08-18,phi-1_5,43.89,12.4,46.8,64.0,38.4,41.0,8.07
phi-2,2023-12-24,phi-2,58.11,54.8,69.0,77.0,49.1,47.6,5.49
pythia-12b,2023-02-24,pythia-12b,26.76,1.7,30.4,29.0,33.0,31.9,
vicuna-7b-v1.5,2023-07-24,vicuna-7b-v1.5,50.82,8.1,71.3,76.0,44.6,42.8,20.39
xverse-13b,2023-08-06,xverse-13b,55.1,18.0,,,,,
zephyr-7b-beta,2023-10-26,zephyr-7b-beta,60.7,11.3,80.7,78.0,34.8,51.2,35.74
zhongjing-base,2023-09-24,zhongjing-base,48.23,26.0,75.4,69.0,39.3,45.2,3.36