Offline evaluation

#13
by kaiwang13 - opened

How can I run an offline evaluation of this benchmark locally?

kaiwang13 changed discussion status to closed

Did you find a way to do offline evaluation?

This leaderboard uses https://github.com/EleutherAI/lm-evaluation-harness.git. You can run offline evaluation with it.

@kaiwang13

The README.md says to use eval_medical_llm.py, but I couldn't find it.
Could you please elaborate a bit on using lm-evaluation-harness for offline evaluation?
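
Not an official answer, but here is a minimal sketch of how a local run with lm-evaluation-harness might be set up, assuming a recent release (v0.4 or later) installed with `pip install lm-eval`. The helpers `build_lm_eval_command` and `offline_env` are hypothetical names of my own, not part of the harness. The model path and the task names (`medmcqa`, `pubmedqa`) are placeholders and assumptions, so check them against the leaderboard's actual task list. For a fully offline run, the model and datasets would need to be in the local Hugging Face cache already.

```python
import os
import shlex


def build_lm_eval_command(model_path, tasks, num_fewshot=0, batch_size=8,
                          output_path="results"):
    """Build an `lm_eval` CLI invocation as a list of arguments.

    Hypothetical helper (not part of lm-evaluation-harness). It only
    assembles the command; it does not run anything.
    """
    return [
        "lm_eval",
        "--model", "hf",                              # Hugging Face backend
        "--model_args", f"pretrained={model_path}",   # local dir or hub id
        "--tasks", ",".join(tasks),                   # comma-separated task names
        "--num_fewshot", str(num_fewshot),
        "--batch_size", str(batch_size),
        "--output_path", output_path,                 # where results are written
    ]


def offline_env(base=None):
    """Return a copy of the environment with Hugging Face offline mode enabled."""
    env = dict(os.environ if base is None else base)
    env["HF_HUB_OFFLINE"] = "1"        # do not contact the Hugging Face Hub
    env["HF_DATASETS_OFFLINE"] = "1"   # load datasets from the local cache only
    return env


if __name__ == "__main__":
    # Placeholder model path and task names (assumptions, adjust as needed).
    cmd = build_lm_eval_command("/path/to/local/model", ["medmcqa", "pubmedqa"])
    print(shlex.join(cmd))
    # To actually launch the evaluation:
    #   import subprocess
    #   subprocess.run(cmd, env=offline_env(), check=True)
```

Running the script prints the command so you can inspect it before launching it. The few-shot setting and batch size here are arbitrary defaults, not the leaderboard's settings.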
