Some suggestions for evaluation priority voting mechanism

#801
by zhiminy - opened

Introducing a community-driven voting system to prioritize model evaluations is an innovative approach to managing resource constraints and budgets effectively :) Thanks for your efforts!

However, without a mechanism to periodically increase the priority of less popular models, there is a risk that some models might never be evaluated, especially considering the high volume of daily submissions.
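One way to avoid starvation is priority aging, where a model's effective priority grows with time spent in the queue. A minimal sketch (the class, the `age_rate_per_day` parameter, and the scoring rule are all illustrative assumptions, not the leaderboard's actual mechanism):

```python
import time

class AgingQueue:
    """Pick the next model by votes plus a bonus that grows with queue age."""

    def __init__(self, age_rate_per_day=1.0):
        # Hypothetical knob: how many "extra votes" a day of waiting is worth.
        self.age_rate = age_rate_per_day
        self.items = []  # list of (submit_time, model_name, votes)

    def add(self, model, votes, submitted=None):
        self.items.append((submitted if submitted is not None else time.time(),
                           model, votes))

    def pop_next(self, now=None):
        now = now if now is not None else time.time()

        def score(item):
            submitted, _, votes = item
            age_days = (now - submitted) / 86400
            return votes + self.age_rate * age_days

        best = max(self.items, key=score)
        self.items.remove(best)
        return best[1]
```

With this scheme, a zero-vote model submitted five days ago outranks a fresh model with three votes, so everything is eventually evaluated.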

Open LLM Leaderboard org

Hi!
Thanks for your interest in the leaderboard!

  1. That is precisely the point, though: we are compute-constrained and needed a fair way to evaluate the models most relevant to the community first, so less popular models may indeed be evaluated much later than others.
  2. The model dropdown can act as a search bar, so I'm not sure what else you would want; can you specify?


Hey @clefourrier
First of all, thank you for your hard work.

I understand the constraints and the need to prioritize models that are most relevant to the community. I appreciate your efforts to ensure fairness in the evaluation process.

I have a suggestion that might help streamline things: would it be possible to offer a paid option for model evaluations? This way, those who are less concerned with votes or popularity and more eager to get their models evaluated quickly could opt for this route.

It could also help support the resources needed for running the evaluations.
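The suggestion above amounts to a two-tier priority: community votes by default, with paid submissions jumping the queue. A minimal sketch of such a scoring rule (the function, the `paid_boost` value, and the model names are hypothetical, not anything the leaderboard implements):

```python
def eval_priority(votes, paid=False, paid_boost=100):
    # Hypothetical scoring: a paid flag adds a fixed boost so paid
    # submissions rank ahead of the vote-ordered queue. The boost size
    # is an illustrative assumption, not a real leaderboard parameter.
    return votes + (paid_boost if paid else 0)

# (model name, community votes, paid?)
queue = [("model-a", 12, False), ("model-b", 2, True), ("model-c", 40, False)]
ordered = sorted(queue, key=lambda m: eval_priority(m[1], paid=m[2]),
                 reverse=True)
```

Here `model-b` scores 102 and is evaluated first despite having the fewest votes; unpaid models still compete on votes among themselves.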


Prioritizing models that are either highly voted or paid for evaluation is quite a brilliant idea, tbh.

Open LLM Leaderboard org

At the moment, we have no easy way to do that, but we've been thinking about something using user tokens + inference endpoints.
However, to give you an order of magnitude: evaluating a 7B model currently takes 2 to 3 hours on 8 H100 80GB GPUs, and a 70B takes an order of magnitude more. It would be quite a budget ^^
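Those figures translate into rough GPU-hour costs as follows (the dollar rate is a placeholder assumption for illustration, not an actual quote):

```python
GPUS = 8
HOURS_7B = 2.5        # midpoint of the 2-3h figure quoted above
RATE = 4.0            # hypothetical $/H100-hour; placeholder, not a real price

gpu_hours_7b = GPUS * HOURS_7B      # GPU-hours per 7B evaluation
gpu_hours_70b = gpu_hours_7b * 10   # "an order of magnitude more" for a 70B
cost_7b = gpu_hours_7b * RATE
cost_70b = gpu_hours_70b * RATE
```

So a single 7B run is on the order of 20 GPU-hours, and a 70B run on the order of 200, which is why pricing a paid lane is not trivial.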
