Imatrix only request: https://huggingface.co/Steelskull/L3-MS-Astoria-70b

#85

by USM-Valor - opened Jun 5

Jun 5

Hello!

I have heard positive things about this model and saw that you had already done the static quants (https://huggingface.co/mradermacher/L3-MS-Astoria-70b-GGUF) but not imatrix. If possible, could you please do so for this model? If not, perfectly understandable.

mradermacher

Owner Jun 5

The reason I didn't do imatrix quants is that the model overflowed during imatrix generation (that's why the "you can request imatrix quants..." paragraph is missing). It's possible that a different set of training data would succeed, but then it's quite likely that the model would overflow during quantisation. So sorry.

mradermacher changed discussion status to closed Jun 5

mradermacher

Owner Jun 5 • edited Jun 5

Actually, as an experiment, I've queued it again to see if it might have been due to reduced precision (the imatrix was originally done with a Q8_0 quant, and will now be done at source precision, in this case likely f32). Also, some overflow bugs in llama.cpp have been fixed, so let's see if it helps.
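
For readers unfamiliar with what "overflowed" can mean here, a rough illustration: as far as I understand it, an importance matrix is built from running sums of squared activations per column, and a single sum that turns into inf or NaN makes the result unusable. The sketch below is not llama.cpp code. The function name `accumulate_importance` and the plain-list input are hypothetical, chosen only to show the kind of finiteness check involved.

```python
import math


def accumulate_importance(activation_rows):
    """Sum squared activations per column (hypothetical helper, not llama.cpp API).

    Raises OverflowError if any column sum ends up as inf or NaN.
    """
    if not activation_rows:
        return []
    sums = [0.0] * len(activation_rows[0])
    for row in activation_rows:
        for i, x in enumerate(row):
            # Squaring a very large float silently yields inf in Python.
            sums[i] += x * x
    bad = [i for i, s in enumerate(sums) if not math.isfinite(s)]
    if bad:
        raise OverflowError(f"non-finite importance in columns {bad}")
    return sums


if __name__ == "__main__":
    print(accumulate_importance([[1.0, 2.0], [3.0, 4.0]]))
    try:
        accumulate_importance([[1e200, 1.0]])
    except OverflowError as err:
        print("overflow detected:", err)
```

Whether the failure in this thread came from the Q8_0 input or from a llama.cpp bug is exactly what the re-queued run is meant to find out; the sketch only shows why one bad value is enough to sink the whole matrix.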

mradermacher

Owner Jun 6

Some IQ3/IQ2 quants were generated. That's a very good sign, meaning it has a high chance to succeed. If the IQ1 quants fail I will keep the rest. Should be finished in a few hours.

USM-Valor

Jun 6

Fantastic! Can’t wait to try it out.

mradermacher

Owner Jun 6

Worked like a charm, all quants successfully generated, cheers!
