Detected duplicate leading "<|begin_of_text|>" in prompt

#2
by AIGUYCONTENT - opened

Getting this error message in Oobabooga after merging the two Q8 files into one:

/home/me/text-generation-webui/installer_files/env/lib/python3.11/site-packages/llama_cpp_cuda_tensorcores/llama.py:1129: RuntimeWarning: Detected duplicate leading "<|begin_of_text|>" in prompt, this will likely reduce response quality, consider removing it...
warnings.warn(
Output generated in 6.82 seconds (7.77 tokens/s, 53 tokens, context 372, seed 891376470)

I just updated Oobabooga today via CLI and as far as I know everything is up-to-date. I have no customizations running at all. 88GB of VRAM.

This feels like an outdated llama.cpp in Oobabooga (older versions of llama.cpp had a hack that added extra BOS tokens), or something wrong in your local prompt. I don't see where the model itself would add that.
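
For what it's worth, here is a minimal sketch of how the duplicate can arise with llama-cpp-python: if the prompt text already contains the literal `<|begin_of_text|>` (e.g. from a chat template that emits BOS itself) and the tokenizer also prepends BOS, you end up with two. The model path is a placeholder.

```python
# Minimal sketch (model path is a placeholder) of a duplicate leading BOS:
# the prompt text already contains the literal "<|begin_of_text|>" and the
# tokenizer prepends BOS on top of it.
from llama_cpp import Llama

llm = Llama(model_path="model.gguf", verbose=False)

prompt = "<|begin_of_text|>Hello"  # e.g. a template that emits BOS as text
# special=True parses "<|begin_of_text|>" as the special BOS token
tokens = llm.tokenize(prompt.encode("utf-8"), add_bos=True, special=True)

bos = llm.token_bos()
print(tokens[:3])
print("duplicate leading BOS:", tokens[0] == bos and tokens[1] == bos)
```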

I can confirm that there is only one BOS token with llama.cpp, so this is a bug in Oobabooga (or your custom chat template, if you have one).
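
The same check against a plain prompt shows the expected behavior, exactly one BOS at the head. This is a sketch using llama-cpp-python as a stand-in for plain llama.cpp, with a placeholder model path:

```python
# Sketch: with no BOS text in the prompt, add_bos=True yields exactly
# one leading BOS token, consistent with the confirmation above.
from llama_cpp import Llama

llm = Llama(model_path="model.gguf", verbose=False)
tokens = llm.tokenize(b"Hello", add_bos=True)
bos = llm.token_bos()
print("leading BOS count:", sum(1 for t in tokens[:2] if t == bos))  # expect 1
```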

No custom chat template. As far as I know, everything is stock.

Ok, will delete and re-download in LM Studio.
