Imatrix only request: https://huggingface.co/Steelskull/L3-MS-Astoria-70b

#85
by USM-Valor - opened

Hello!

I have heard positive things about this model and saw that you had already done the static quants (https://huggingface.co/mradermacher/L3-MS-Astoria-70b-GGUF) but not imatrix. If possible, could you please do so for this model? If not, perfectly understandable.

The reason I didn't do imatrix quants is that the model overflowed during imatrix generation (that's why the "you can request imatrix quants..." paragraph is missing). It's possible that a different set of training data would succeed, but then it's quite likely that the model would overflow during quantisation. So, sorry.

mradermacher changed discussion status to closed

Actually, as an experiment, I've queued it again to see if the failure might have been due to reduced precision (the imatrix was originally computed from a Q8_0 quant, and will now be computed at source precision, in this case likely f32). Also, some overflow bugs in llama.cpp have been fixed, so let's see if that helps.
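
For anyone wondering why precision could matter here, below is a minimal sketch of one way a reduced-precision format can overflow where f32 does not. It assumes the usual Q8_0 layout (blocks of 32 int8 values sharing one fp16 scale) and is not a reconstruction of what actually went wrong with this model. The helper names `to_fp16` and `q8_0_scale` are made up for illustration and are not llama.cpp functions.

```python
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_fp16(x: float) -> float:
    """Round x through half precision; return +/-inf if it does not fit."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses values beyond the fp16 range; model that as infinity
        return float("inf") if x > 0 else float("-inf")


def q8_0_scale(block):
    """Per-block scale as a Q8_0-style scheme would store it (assumed: amax / 127 in fp16)."""
    amax = max(abs(v) for v in block)
    return to_fp16(amax / 127.0)


# Ordinary weights: the scale fits comfortably in fp16.
small = q8_0_scale([1.0] * 32)

# One extreme outlier: amax / 127 exceeds FP16_MAX, so the stored scale becomes inf,
# while the same value is still representable in f32.
huge = q8_0_scale([1e7] + [0.0] * 31)

print(small, huge)
```

Once a scale like that turns into inf, everything computed from the block is inf or NaN, which is the kind of failure that working from the f32 source can avoid.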

Some IQ3/IQ2 quants were generated. That's a very good sign, meaning it has a high chance to succeed. If the IQ1 quants fail I will keep the rest. Should be finished in a few hours.

Fantastic! Can’t wait to try it out.

Worked like a charm, all quants successfully generated, cheers!
