General discussion and feedback.

#1
by xxx31dingdong - opened

Ministral 8B has a special interleaved sliding-window attention pattern for faster and memory-efficient inference.
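For intuition, here is a toy sketch of what an interleaved sliding-window pattern looks like as per-layer attention masks. This is not Mistral's implementation; the window size and interleave ratio below are placeholders, since the exact pattern isn't spelled out in this thread:

```python
# Toy illustration (not Mistral's code): per-layer causal masks where some
# layers attend to the full prefix and others only to a local sliding window.
# window and full_every are made-up values for the demo.
import torch

def layer_masks(seq_len: int, window: int = 4, full_every: int = 4) -> list[torch.Tensor]:
    """Return one boolean mask per layer; True = position may be attended."""
    idx = torch.arange(seq_len)
    causal = idx[None, :] <= idx[:, None]                       # full causal mask
    sliding = causal & (idx[:, None] - idx[None, :] < window)   # local window only
    # e.g. one full-attention layer interleaved with sliding-window layers
    return [causal if layer % full_every == 0 else sliding
            for layer in range(full_every)]

for i, m in enumerate(layer_masks(seq_len=6)):
    print(f"layer {i}:\n{m.int()}")
```

The memory win comes from the sliding-window layers: they only ever need the last `window` keys/values in the KV cache, regardless of total context length.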

Please provide feedback if this GGUF works or not.

It seems to work fine at low context; some have reported oddities at long context, and others have reported subpar performance from the original model hosted in an HF space, so it's hard to be certain whether the GGUF or the original model is broken.

So far, though, I can reasonably say that at low context it works as expected.

As things develop I will update this card, or pull the model if I receive further negative feedback showing bad performance, but initial testing is promising.

I have a somewhat long context, and unfortunately I am running into issues where the generated output stops after a few sentences (often mid-sentence).
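A minimal way to check whether this is the model emitting an early EOS versus hitting a token limit (a sketch using llama-cpp-python; the model path, context size, and prompt are placeholders, not values from this thread):

```python
# Sketch for diagnosing early stops: generate against a longer prompt and
# inspect the finish reason. Model path, n_ctx, and prompt are placeholders.
from llama_cpp import Llama

llm = Llama(model_path="Ministral-8B-Instruct-2410-Q4_K_M.gguf", n_ctx=8192)

out = llm(
    "Summarize the following document:\n" + "lorem ipsum " * 500,
    max_tokens=512,
)
choice = out["choices"][0]
# "stop" means the model emitted EOS (or a stop string) on its own;
# "length" means it ran into the max_tokens cap.
print(choice["finish_reason"])
print(choice["text"][:200])
```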

@bartowski how did you convert this?

INFO:hf-to-gguf:Loading model: Ministral-8B-Instruct-2410
Traceback (most recent call last):
  File "/content/llama.cpp/convert_hf_to_gguf.py", line 4430, in <module>
    main()
  File "/content/llama.cpp/convert_hf_to_gguf.py", line 4398, in main
    hparams = Model.load_hparams(dir_model)
  File "/content/llama.cpp/convert_hf_to_gguf.py", line 462, in load_hparams
    with open(dir_model / "config.json", "r", encoding="utf-8") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'Ministral-8B-Instruct-2410/config.json'

Oh wait.. you converted from this: prince-canuma/Ministral-8B-Instruct-2410-HF
Not from the mistralai repo
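For anyone hitting the same FileNotFoundError: convert_hf_to_gguf.py reads `config.json` from the model directory (exactly the line shown in the traceback), so it needs a transformers-format checkout. A sketch of downloading the HF-format repo named in this thread and running the conversion; the local paths, output filename, and outtype are illustrative, not the exact commands bartowski used:

```python
# Download a transformers-format repo (has config.json), then run llama.cpp's
# converter on it. Output name and dtype below are illustrative choices.
import subprocess
from huggingface_hub import snapshot_download

model_dir = snapshot_download(
    repo_id="prince-canuma/Ministral-8B-Instruct-2410-HF",
    local_dir="Ministral-8B-Instruct-2410-HF",
)

subprocess.run(
    [
        "python", "convert_hf_to_gguf.py",
        model_dir,
        "--outfile", "Ministral-8B-Instruct-2410-f16.gguf",
        "--outtype", "f16",
    ],
    check=True,
)
```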
