Quantized Versions?

#9
by StopLockingDarkmode - opened

This model is just out of reach for most consumers with 24 GB of VRAM. Will you be providing any quantized versions?
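(In case it helps anyone else stuck at 24 GB before proper quants land: below is a rough sketch of loading with on-the-fly 4-bit quantization through transformers + bitsandbytes. The model ID and model class are assumptions based on the community HF-format port, not the official Mistral-format weights, so adjust as needed.)

```python
# Sketch only: assumes the transformers-compatible community port of Pixtral
# and a recent transformers + bitsandbytes install.
import torch
from transformers import AutoProcessor, LlavaForConditionalGeneration, BitsAndBytesConfig

model_id = "mistral-community/pixtral-12b"  # assumption: HF-format port, not mistralai/Pixtral-12B-2409

# 4-bit NF4 weights are roughly a quarter of the bf16 footprint, which is what
# makes a 24 GB card plausible for a 12B vision-language model.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```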

Normally others in the community do that if you are unable to do it yourself. It takes a little bit of time, I have noticed, but it does happen.

(Ok, so I'm wrong on timing; it's already been done. A quick search found a couple of people who have done it already.)

Really? All I'm seeing is a few GGUFs that can't be used for inference at all.

I didn't try them; I just noticed it was already happening.

I tried running it on an A100 (40 GB). Still out of memory! :(

I was testing and it took around 60 GB of VRAM running the example code.
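(If you want to double-check that number yourself, something like this reads the CUDA allocator's peak stat after the example code finishes; nothing Pixtral-specific about it.)

```python
import torch

torch.cuda.reset_peak_memory_stats()
# ... load the model and run the example inference here ...
peak_gib = torch.cuda.max_memory_allocated() / 1024**3
print(f"Peak VRAM allocated: {peak_gib:.1f} GiB")
```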

60 GB is crazy. Does it have any noticeably better results than Flux?

Hi @sonam-shrish ,
Correct me if I'm wrong, but Pixtral is a vision-language model while FLUX is a text-to-image model, meaning the comparison would be like comparing apples and pears.
Best,
M

It appears all quantization methods in vLLM rely on transformers.
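(For what it's worth, vLLM can also load pre-quantized checkpoints directly; a sketch below, where the repo name is purely hypothetical and assumes someone has published an AWQ quant in a format vLLM accepts.)

```python
from vllm import LLM

# Sketch only: "some-user/pixtral-12b-awq" is a made-up repo name; vLLM expects a
# checkpoint that was already quantized offline (AWQ, GPTQ, ...) in a layout it understands.
llm = LLM(
    model="some-user/pixtral-12b-awq",
    quantization="awq",
    max_model_len=8192,  # a smaller context shrinks the KV cache if memory is still tight
)
```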

GGUF version?

Hi @mikehemberger ,
you're right. I didn't know that Pixtral is only a VLM and doesn't generate any images.
I didn't read the model card properly.
Thanks :)
Best,
Sonam

You're welcome @sonam-shrish,
I hope at some point Mistral AI will also tackle text-to-image, though ;-)

Yeah, that would be cool.

Hey guys,

idk if I am just stupid rn, but where did you find quantized versions?

I just did a simple search at the top of the HF page. Though, as others have mentioned, they may not be ready for prime time yet.
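(If searching on the website is fiddly, the same thing works from Python with huggingface_hub; the query term here is just a guess.)

```python
from huggingface_hub import HfApi

api = HfApi()
# Sort by downloads so the most-used community quants show up first.
for m in api.list_models(search="pixtral gguf", sort="downloads", direction=-1, limit=10):
    print(m.id)
```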

Please stop forcing people to buy higher-end NVIDIA cards!

We all hate being on the GPU treadmill, but if you want to move forward, bigger is just going to be the way. Hopefully, in the near future, shared-RAM NPUs will be able to take the strain of many of these larger models and we can skip the GPU.

Hopefully not.

You hope we don't get cheaper hardware that can take the place of GPUs? Do you work for NVIDIA or something?

Oh, sorry, I misread it! I mean, hopefully we'll get better GPUs/NPUs or whatever, but not shared GPUs (like the global miner networks I saw in some interview).
