MMLU-Pro benchmark?

#25

by lightenup - opened 6 days ago

6 days ago

Are MMLU-Pro benchmark results available anywhere, including the numbers for the individual categories? I'd be interested in a benchmark of the officially released model (not quantized).

ubowang

1 day ago

Feel free to check out the performance of gemma-2-9b and gemma-2-9b-it on MMLU-Pro at https://huggingface.co/spaces/TIGER-Lab/MMLU-Pro.

lightenup

about 20 hours ago

•

edited about 20 hours ago

Thanks!

The prompt itself and all other inference parameters strongly affect the MMLU-Pro score, so it would be great if the prompt and inference parameters that were used were published somewhere.

I am also even more eager to see the gemma-2-27b MMLU-Pro results, since gemma-2-9b already scores so high!
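
To make the point about prompts and parameters concrete, here is a minimal sketch of the pieces that would need to be reported for a score to be reproducible. It is not the prompt or the settings used for the leaderboard, which are not stated in this thread. The names `GENERATION_SETTINGS`, `format_question` and `extract_answer`, the prompt wording, and all values are assumptions for illustration only.

```python
import re

# Hypothetical settings for illustration; the values used for the
# leaderboard runs are not given in this thread.
GENERATION_SETTINGS = {
    "temperature": 0.0,      # assumed: greedy decoding
    "max_new_tokens": 2048,  # assumed: room for step-by-step reasoning
    "num_few_shot": 5,       # assumed: number of in-context examples
}

# MMLU-Pro questions can have up to ten answer options.
LETTERS = "ABCDEFGHIJ"


def format_question(question, options):
    """Render one multiple-choice question as a prompt (assumed wording)."""
    lines = [f"Question: {question}", "Options:"]
    for letter, option in zip(LETTERS, options):
        lines.append(f"{letter}. {option}")
    lines.append("Answer: Let's think step by step.")
    return "\n".join(lines)


def extract_answer(completion):
    """Return the chosen option letter, or None if no answer is found."""
    match = re.search(r"answer is \(?([A-J])\)?", completion)
    return match.group(1) if match else None


prompt = format_question("What is 2 + 2?", ["3", "4", "5"])
predicted = extract_answer("2 + 2 equals 4, so the answer is (B).")
```

A different prompt wording, few-shot count, temperature, or extraction rule can each change which completions are counted as correct, which is why two runs of the same model can report different scores.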
