WizardLM-8x22B Evaluation failed

#823
by llama-anon - opened
Open LLM Leaderboard org
•
edited 19 days ago

Hi @llama-anon ,

Thanks for providing the request file!
Currently, our cluster is quite full, so we're only evaluating models that can run on a single node. This approach helps us evaluate more models concurrently. However, if there's enough interest from the community, we're open to manually evaluating models that require more than one node, like the one you've submitted.

Another interested user here. WizardLM 8x22B is my current go-to open-source model; I use it for pretty much everything.

Definitely interested in seeing how WizardLM-2 8x22B stacks up. It seems vastly better than the other fine-tunes of 8x22B, including Mistral's own. I think the only reason it hasn't gotten more attention is that it was never put on LMSYS Arena. It's been in the top ten most-used models on OpenRouter for a while now, and I think it would have a solid chance of topping the leaderboard.

Would be good to get this added. It's been out quite some time, but people really rave about it.

Very capable finetune by Microsoft, would love to see it added (and potentially the 7B variant)!

Strongest FOSS model/finetune, except for coding. Crazy this isn't on the leaderboard. Yes, it needs a lot of VRAM, but it would really showcase the best of open source, IMO.

WizardLM-2-8x22b is one of the most powerful open-source language models. It would be really great to see how it performs compared to other open-source large language models on the Open-LLM-Leaderboard.

Voting WizardLM-2-8x22b

Would love to see Wizard ranked! It'd be really good to see how it compares to other wizard and non-wizard models.

Would also love to see it ranked. It was a fantastic model when I tried online hostings of it. Still worthwhile even in the lobotomized local usage my setup can get out of it, which was a pleasant surprise.

I definitely have an interest in seeing the Wizard benchmarks. This topic has come up a few times on LocalLlama, but none of us have really known how to get it up here and just assumed it wouldn't happen.

I think you'd make a few people pretty happy if you were able to squeeze this one in.

Wiz 8x22 at 5bpw is still my daily driver. Its writing, contextual awareness, and fringe knowledge are still unmatched, IMO. Would love to see how it stacks up against the other top dogs.

Voting WizardLM-2-8x22b

Add my vote!

Voting WizardLM-2-8x22b

Vote from me as well!

I would like to see it too.

Very interested to see it benchmarked as well, +1

Another vote from me

I'd rather we run the highest-quality models first to get a baseline going, then move on to quantity. The goal is to push the top score up as soon as possible so we break the plateau.

Get WizardLM in!

Open LLM Leaderboard org

Hi everyone,

Thanks for your messages and activity! Let's start the WizardLM-2-8x22B evaluation! 🚀


Great news! I'm really curious how it stacks up. I'm also glad the feedback was heard.

Open LLM Leaderboard org
โ€ข
edited 9 days ago

Thanks, everyone, for your activity and patience! WizardLM-2-8x22B is now 8th on the Leaderboard with an average score of 32.61!

(Screenshot of the leaderboard standings, 2024-07-15)

alozowski changed discussion status to closed

Great, thank you for the evaluation!
