Output is merely a copy of the input for 70B @ webui

#13
by wholehope - opened

Can anybody enlighten me on how to run inference with the 70B-GPTQ model (chat or non-chat) using oobabooga/text-generation-webui? Whether I use the Llama-v2 instruct format mentioned on the model card or just a plain prompt, the output is always an exact copy of the input. In the same webui, I can run inference with the 13B/7B-GPTQ models (chat or non-chat) without any problem.
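
For reference, here is a rough sketch of how I'd sanity-check the model outside the webui using AutoGPTQ directly (the model path is a placeholder, and the prompt template is my reading of the Llama-2 chat format from the model card, so treat both as assumptions):

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

# Placeholder path: substitute the actual GPTQ repo name or local directory.
model_path = "TheBloke/Llama-2-70B-chat-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(
    model_path,
    use_safetensors=True,
    device="cuda:0",
)

# Llama-2 chat models expect the [INST] ... [/INST] wrapper; without it
# the model can degenerate into echoing the input back.
prompt = "[INST] Write a haiku about quantization. [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
output = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
)
# The decoded text includes the prompt followed by the generation.
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

If this produces sensible output but the webui still only echoes the input, that would point at the webui's loader or prompt-template settings rather than the model files themselves.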

I also have issues in textgen webui: no tokens are generated. Only the chat interface works; the other one doesn't.