Alina Lozovskaya

#851 opened 3 days ago by

nlpguy

Is there an issue with adding bos in the new evaluation?

#852 opened 3 days ago by

lingyun1

Model deleted from Pending

6

#850 opened 3 days ago by

dnhkng

New activity in open-llm-leaderboard/open_llm_leaderboard 3 days ago

dolphin-2.9.2-qwen2-72b failed, check logs

#824 opened 17 days ago by

CombinHorizon

70B models FAILED

7

#830 opened 11 days ago by

MaziyarPanahi

Latest results from eval runs are not updated on Leaderboard/Content repo

#847 opened 4 days ago by

pankajmathur

[BUG] Gemma2-9b-it evaluation

#849 opened 4 days ago by

DeepMount00

New activity in open-llm-leaderboard/open_llm_leaderboard 4 days ago

model evaluation results not updated on the leaderboard

#846 opened 4 days ago by

Azure99

New activity in open-llm-leaderboard/open_llm_leaderboard 5 days ago

Wrong results, or am I understanding something wrong?

8

#839 opened 9 days ago by

nicobuko

State of Open LLM Leaderboard v2 evals and Reproducibility Issues.

8

#829 opened 12 days ago by

pankajmathur

Submitted models aren't showing up

#835 opened 10 days ago by

'Running' from the first day of the new leaderboard to pending and not showing anymore

#845 opened 6 days ago by

DavidGF

The problem about the overall score of BBH and GPQA datasets

#842 opened 7 days ago by

Amigozyq

New activity in open-llm-leaderboard/open_llm_leaderboard 6 days ago

submission-system-update

#844 opened 6 days ago by

Model not on pending for evaluation

#841 opened 7 days ago by

acbdkk

Gemma-2-9B-it scores

#843 opened 6 days ago by

saishf

WizardLM-8x22B Evaluation failed

25

#823 opened 17 days ago by

llama-anon

New activity in open-llm-leaderboard/open_llm_leaderboard 9 days ago

submit-system-update

12

#838 opened 9 days ago by

bump-up-gradio_leaderboard

6

#836 opened 9 days ago by

New activity in open-llm-leaderboard/open_llm_leaderboard 10 days ago

v2 voting

23

#831 opened 11 days ago by

lucyknada

New activity in open-llm-leaderboard/open_llm_leaderboard 11 days ago

Feature Request for Leaderboard: date added to hub

8

#425 opened 8 months ago by

madmaxbr5

Model Eval Failed: Tess-v2.5.2-Qwen2-72B

#826 opened 13 days ago by

migtissera

New activity in open-llm-leaderboard/open_llm_leaderboard 12 days ago

The results of BBH are inconsistent with the official results of Qwen2

#827 opened 13 days ago by

peels7877

New activity in open-llm-leaderboard/open_llm_leaderboard 13 days ago

Raw results to normalized results

#825 opened 13 days ago by

Ilyasch2

I am getting this error: Base model "mistralai/Mistral-7B-Instruct-v0.2" was not found or misconfigured on the hub!

#819 opened 19 days ago by

rootxhacker

RecurrentGemma - add the rest of the models!

#800 opened 25 days ago by

devingulliver

New activity in open-llm-leaderboard/open_llm_leaderboard 17 days ago

Average column values

5

#821 opened 18 days ago by

New activity in open-llm-leaderboard/open_llm_leaderboard 19 days ago

It seems that PHI-3 is the best...

#816 opened 19 days ago by

ZeroWw

Archive of the last leaderboard

5

#807 opened 23 days ago by

MarxistLeninist

Some models are tagged with incorrect model types

#806 opened 23 days ago by

scinerd68

Model glm-4-9b-chat 128K and 1M are missing

#817 opened 19 days ago by

ZeroWw

Failed evaluation for Miqu-70B

#812 opened 23 days ago by

llama-anon

cannot load results

#811 opened 23 days ago by

ysharma1126

Leaderboard data

#813 opened 21 days ago by

New activity in open-llm-leaderboard/open_llm_leaderboard 23 days ago

fix-merged-column

#810 opened 23 days ago by

New activity in open-llm-leaderboard/open_llm_leaderboard 24 days ago

submission-fix

19

#803 opened 24 days ago by

New activity in open-llm-leaderboard/open_llm_leaderboard about 2 months ago

No good way to identify number of activated parameters causes Mixtral evaluation failures

32

#680 opened 3 months ago by

0-hero

70B models failed

#756 opened about 2 months ago by

MaziyarPanahi

Model Submission Finished but Not Listed in Results

7

#747 opened 2 months ago by

Stefan171

New activity in open-llm-leaderboard-old/requests about 2 months ago

Eval Failed

#146 opened about 2 months ago by

ajibawa-2023

New activity in open-llm-leaderboard-old/details_Ramikan-BR__tinyllama_PY-CODER-4bit-lora_4k-v12 about 2 months ago

Create README.md

#1 opened about 2 months ago by

Ramikan-BR

New activity in open-llm-leaderboard-old/requests about 2 months ago

Failed eval

6

#125 opened 3 months ago by

KnutJaegersberg

New activity in open-llm-leaderboard/open_llm_leaderboard about 2 months ago

Leaderboard stuck?

#754 opened about 2 months ago by

DreamGenX

Update LB to latest transformers

#751 opened 2 months ago by

MaziyarPanahi

bump-transformers-to-4.41.1

#753 opened about 2 months ago by

New activity in open-llm-leaderboard-old/requests 2 months ago

DBRX-Instruct evaluation failed, likely due to model size (132B params)

#121 opened 3 months ago by

abhi-db

New activity in open-llm-leaderboard/open_llm_leaderboard 2 months ago

apply-ruff

7

#748 opened 2 months ago by

Feature Request: Multilingual Evaluations 🌐

#745 opened 2 months ago by

eliot-christon

Models that used Nectar dataset

13

#749 opened 2 months ago by