Does this work for anyone?

#1
by Downtown-Case - opened

15K downloads, but even with the PR applied, all this model does is error out for me, whether I use one of these quants or quant it myself.

As written in the readme 😉, the model was generated with a branch of llama.cpp intended to add support for GLM4/GLM3.

Support was merged into master 3 days ago: https://github.com/ggerganov/llama.cpp/pull/8031

If the model still doesn't work after you've updated llama.cpp, ping me and I'll do a re-quant.
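A minimal update-and-retry sketch, assuming a local git checkout of llama.cpp built with cmake (the model filename here is just an illustration):

git pull                                       # pick up the merged GLM-4 support
cmake -B build && cmake --build build --config Release
./build/bin/llama-cli -m glm-4-9b-chat-Q4_K_M.gguf -p "Hello"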

The PR suggests the 1M version of the model isn't even supported yet, lol.

https://github.com/ggerganov/llama.cpp/pull/8031#issuecomment-2213635819

It doesn't work when I quantize it myself either.

llama_model_load: error loading model: check_tensor_dims: tensor 'blk.0.attn_qkv.weight' has wrong shape; expected 4096, 4608, got 4096, 5120, 1, 1
llama_load_model_from_file: failed to load model

I tried Q4, Q6, Q8, and F16 on Ollama, llama.cpp, and KoboldCpp. Always the same error message.
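For what it's worth, the numbers in that error are consistent with a grouped-query-attention mismatch: assuming a head dim of 128, the expected 4608 = 4096 + 2*2*128 (hidden size plus two KV heads in the fused QKV weight), while the 5120 in the file = 4096 + 2*4*128, i.e. four KV heads, which would line up with the 1M variant being unsupported. You can check what a GGUF actually contains with the gguf-dump script from the gguf Python package (filename illustrative):

pip install gguf
gguf-dump glm-4-9b-chat-1m-Q4_K_M.gguf | grep attn_qkv   # prints the tensor's shape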

Ok, thanks for the heads-up! I will add a notice to the readme and do a re-quant once support is implemented.
