Does this work for anyone?

#1
by Downtown-Case - opened

15K downloads, but even with the PR applied, all this model does is error out for me, whether I use one of these quants or quant it myself.

As written in the readme 😉, the model was generated with a branch of llama.cpp intended to add support for GLM4/GLM3.

Support was merged into master 3 days ago: https://github.com/ggerganov/llama.cpp/pull/8031

If the model still doesn't work after you've updated llama.cpp, ping me and I'll do a re-quant.
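A minimal update-and-retry sketch, assuming a local git checkout of llama.cpp built with cmake (the model filename here is just an illustration):

git pull                                       # pick up the merged GLM-4 support
cmake -B build && cmake --build build --config Release
./build/bin/llama-cli -m glm-4-9b-chat-Q4_K_M.gguf -p "Hello"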

The PR suggests the 1M version of the model isn't even supported yet, lol.

https://github.com/ggerganov/llama.cpp/pull/8031#issuecomment-2213635819

It doesn't work when I quantize it myself either.

llama_model_load: error loading model: check_tensor_dims: tensor 'blk.0.attn_qkv.weight' has wrong shape; expected 4096, 4608, got 4096, 5120, 1, 1
llama_load_model_from_file: failed to load model

I tried Q4, Q6, Q8, and F16 on Ollama, llama.cpp, and KoboldCpp. Always the same error message.
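For what it's worth, the numbers in that error are consistent with a grouped-query-attention mismatch: assuming a head dim of 128, the expected 4608 = 4096 + 2*2*128 (hidden size plus two KV heads in the fused QKV weight), while the 5120 in the file = 4096 + 2*4*128, i.e. four KV heads, which would line up with the 1M variant being unsupported. You can check what a GGUF actually contains with the gguf-dump script from the gguf Python package (filename illustrative):

pip install gguf
gguf-dump glm-4-9b-chat-1m-Q4_K_M.gguf | grep attn_qkv   # prints the tensor's shape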

Ok, thanks for the heads-up! I will add a notice to the readme and do a re-quant once support is implemented.
