Spaces: openlifescienceai/open_medical_llm_leaderboard
Offline evaluation

#13

by kaiwang13 - opened May 23

Discussion

kaiwang13

May 23

How to do offline evaluation of this benchmark locally?

kaiwang13 changed discussion status to closed May 23

williamjeong2

May 23

Did you find a way to do offline evaluation?

kaiwang13

May 23

> Did you find a way to do offline evaluation?

This leaderboard uses lm-evaluation-harness (https://github.com/EleutherAI/lm-evaluation-harness.git). You can run the evaluation offline with it.

williamjeong2

May 24

@kaiwang13

The README.md says to use eval_medical_llm.py, but I couldn't find that file.
Could you please elaborate a bit on how to use lm-evaluation-harness for offline evaluation?
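
For anyone landing here with the same question, here is a minimal sketch of what a local run with lm-evaluation-harness might look like. It only assembles the `lm_eval` command line (the harness's CLI in its 0.4.x releases) and the Hugging Face offline environment variables. The task names, the model path, and the helper names `build_command`, `offline_env`, and `run` are assumptions for illustration; the thread does not say which tasks or settings the leaderboard actually uses, and the datasets and model must already be in the local cache for an offline run to work.

```python
import os
import subprocess

# Assumed task names taken from upstream lm-evaluation-harness.
# The thread does not list the tasks the leaderboard actually runs.
MEDICAL_TASKS = ["medmcqa", "medqa_4options", "pubmedqa"]


def build_command(model_path, tasks=MEDICAL_TASKS, batch_size=8, output_path="results"):
    """Return the argv list for the harness's `lm_eval` command-line entry point."""
    return [
        "lm_eval",
        "--model", "hf",                              # Hugging Face transformers backend
        "--model_args", f"pretrained={model_path}",   # local directory or Hub model id
        "--tasks", ",".join(tasks),
        "--batch_size", str(batch_size),
        "--output_path", output_path,
    ]


def offline_env():
    """Copy the current environment and ask the Hugging Face libraries not to use the network."""
    env = dict(os.environ)
    env["HF_HUB_OFFLINE"] = "1"        # huggingface_hub / transformers: use the local cache only
    env["HF_DATASETS_OFFLINE"] = "1"   # datasets: use the local cache only
    return env


def run(cmd):
    """Run the evaluation; needs `pip install lm-eval` and pre-downloaded model and datasets."""
    subprocess.run(cmd, env=offline_env(), check=True)


if __name__ == "__main__":
    # Hypothetical local model path; replace with your own checkpoint directory.
    cmd = build_command("/path/to/local/model")
    print(" ".join(cmd))
    # Call run(cmd) once the harness is installed and the caches are populated.
```

One way to prepare for the offline run is to execute the same command once with network access so the model and datasets are cached, then repeat it with the offline variables set.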
