https://huggingface.co/jondurbin/bagel-dpo-20b-v04-llama

#67
by Cran-May - opened

jondurbin/bagel-dpo-20b-v04-llama and jondurbin/bagel-20b-v04-llama
need iMatrix quants, because the existing quant from dranger003 ("dranger003/bagel-dpo-20b-v04-llama-iMat.GGUF") occasionally generates a wrong token, emitting an end-of-turn-type token at an inopportune time (reproducible in LM Studio versions 0.2.18 through 0.2.23).
Also, the quant for bagel-dpo-20b-v04 (the un-llamafied version) cannot be loaded correctly in LM Studio.

Not sure another quant will change anything about these issues, but sure, they are in the queue. If nothing goes wrong, they should be there within a few hours. Next time, it would help to give URLs for the models.

mradermacher changed discussion status to closed

Unfortunately, llama.cpp crashes when trying to generate the imatrix, and this looks like a bug in the model, so no imatrix quants:

GGML_ASSERT: llama.cpp/llama.cpp:4530: unicode_cpts_from_utf8(word).size() > 0

This might affect the static ones as well.

Yup, affects the static quants as well. I could try with the old convert.py script, but that is unlikely to yield different results than dranger003's, other than a different imatrix. The model probably has some tokenizer issue, so the choice is between the model not loading at all or loading with subtle bugs.
