Context Length 32k tokens?

#2 opened by fuckfuckyou11

Why does this GGUF list a 32k context length in the description? Here https://huggingface.co/Qwen/Qwen2.5-72B-Instruct it states a 131k context length. What happened?

Qwen org

Has llama.cpp added YaRN support yet? If it has, enabling YaRN as described in the original model's model card should extend the context length.


Should it? I had never heard of YaRN. I tried to find issues in the llama.cpp GitHub repo and found nothing, neither open nor closed. If it is supported, then back to my original question: why does the description still say a 32k context length?

Qwen org

A 128K context length needs YaRN (that's what we have tested). No YaRN, no 128K.

If you use other methods to extend the context length, they may also work, but we don't really know.
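
For reference, the model card linked above enables the 128K window for the original (safetensors) model by merging a YaRN entry into its `config.json`; roughly the following, with the key names as shown in that model card (this applies to transformers/vLLM-style loading, not to a GGUF directly):

```json
{
  "rope_scaling": {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn"
  }
}
```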

llama.cpp merged YaRN support of some kind before Nov 4, 2023: https://github.com/ggerganov/llama.cpp/discussions/2963#discussioncomment-7475016

I suggest directing queries to the GitHub discussions or issues pages.

I also found some discussion here: https://github.com/ggerganov/llama.cpp/discussions/7416
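
Putting this together for a GGUF, the YaRN settings are passed to llama.cpp at run time rather than stored in `config.json`. A minimal sketch, assuming a recent build where the main binary is called `llama-cli` and the rope-scaling flags are `--rope-scaling`, `--rope-scale`, and `--yarn-orig-ctx` (check `--help` on your build; the GGUF filename is a placeholder):

```sh
# --rope-scale 4 follows from the target context 131072 / native 32768 = 4.
./llama-cli -m qwen2.5-72b-instruct-q4_k_m.gguf \
  --rope-scaling yarn \
  --rope-scale 4 \
  --yarn-orig-ctx 32768 \
  -c 131072
```

Note that a full 131072-token KV cache on a 72B model needs a large amount of memory, so a smaller `-c` may be more practical.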

Awesome. So there is no reason to state 32k in the description if llama.cpp has supported YaRN since 11/2023; it could state 128K by default.
