Detected duplicate leading "<|begin_of_text|>" in prompt

#2
by AIGUYCONTENT - opened

Getting this error message in Oobabooga after merging the two Q8 files into one:

/home/me/text-generation-webui/installer_files/env/lib/python3.11/site-packages/llama_cpp_cuda_tensorcores/llama.py:1129: RuntimeWarning: Detected duplicate leading "<|begin_of_text|>" in prompt, this will likely reduce response quality, consider removing it...
warnings.warn(
Output generated in 6.82 seconds (7.77 tokens/s, 53 tokens, context 372, seed 891376470)

I just updated Oobabooga today via CLI and as far as I know everything is up-to-date. I have no customizations running at all. 88GB of VRAM.

This feels like an outdated llama.cpp in Oobabooga (older versions of llama.cpp had a hack that added extra BOS tokens), or something wrong in your local prompt. I don't see where the model itself would add that.
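
For what it's worth, here is a minimal sketch of how the duplicate can arise with llama-cpp-python: if the prompt text already contains the literal `<|begin_of_text|>` (e.g. from a chat template that emits BOS itself) and the tokenizer also prepends BOS, you end up with two. The model path is a placeholder.

```python
# Minimal sketch (model path is a placeholder) of a duplicate leading BOS:
# the prompt text already contains the literal "<|begin_of_text|>" and the
# tokenizer prepends BOS on top of it.
from llama_cpp import Llama

llm = Llama(model_path="model.gguf", verbose=False)

prompt = "<|begin_of_text|>Hello"  # e.g. a template that emits BOS as text
# special=True parses "<|begin_of_text|>" as the special BOS token
tokens = llm.tokenize(prompt.encode("utf-8"), add_bos=True, special=True)

bos = llm.token_bos()
print(tokens[:3])
print("duplicate leading BOS:", tokens[0] == bos and tokens[1] == bos)
```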

I can confirm that there is only one BOS token with llama.cpp, so this is a bug in Oobabooga (or your custom chat template, if you have one).
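
The same check against a plain prompt shows the expected behavior, exactly one BOS at the head. This is a sketch using llama-cpp-python as a stand-in for plain llama.cpp, with a placeholder model path:

```python
# Sketch: with no BOS text in the prompt, add_bos=True yields exactly
# one leading BOS token, consistent with the confirmation above.
from llama_cpp import Llama

llm = Llama(model_path="model.gguf", verbose=False)
tokens = llm.tokenize(b"Hello", add_bos=True)
bos = llm.token_bos()
print("leading BOS count:", sum(1 for t in tokens[:2] if t == bos))  # expect 1
```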

No custom chat template. As far as I know, everything is stock.

Ok, will delete and re-download in LM Studio.
