The model stopped working (responds gibberish)

#8
by jfaiofj92459 - opened

I use the code from the provided example:

from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig

model_id = "mistralai/Mixtral-8x22B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
config = AutoConfig.from_pretrained(model_id)

model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=config.torch_dtype)

text = "Hello my name is"
inputs = tokenizer(text, return_tensors="pt")

outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Output:

Hello my name is becichropolHowever ther FrancesCatalog ignorantPB crít ragased crít rush appearanceióences studrefixHoweverCTXlabelsutil� reallySchich Zent dancingachment warehouseAIdhdOR asleepdhdARCHest nights additionalPropertyraisingcircle    dhd panic werdhd%%%%dhdteeest distancesiftPropertyraisingcircle    

Library versions:

  • transformers==4.42.3
  • huggingface-hub==0.23.4

Devices:

  • NVIDIA H100s

I get exactly the same issue. I wonder if it's related to the recent change to the tokenizer?

Does it work if you change vocab_size in config.json to 32768? (and possibly replace the tokenizer with the one from the instruct version)
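Before editing config.json, it may help to check whether the tokenizer and the config even agree on vocabulary size. Below is a minimal sketch of such a check; the helper names `describe_vocab_mismatch` and `check_repo` are made up for illustration and are not part of transformers. A size mismatch is only a hint, and equal sizes do not prove the two vocabularies are actually the same.

```python
from typing import Optional


def describe_vocab_mismatch(tokenizer_size: int, config_vocab_size: int) -> Optional[str]:
    """Return a warning string if the two sizes differ, else None."""
    if tokenizer_size == config_vocab_size:
        return None
    return (
        f"tokenizer has {tokenizer_size} tokens but config.vocab_size is "
        f"{config_vocab_size}"
    )


def check_repo(model_id: str) -> Optional[str]:
    """Fetch only the tokenizer and config (no weights) and compare their sizes."""
    # Imported lazily so the pure helper above works without transformers installed.
    from transformers import AutoConfig, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    config = AutoConfig.from_pretrained(model_id)
    # len(tokenizer) counts the base vocabulary plus any added tokens.
    return describe_vocab_mismatch(len(tokenizer), config.vocab_size)
```

Calling `print(check_repo("mistralai/Mixtral-8x22B-v0.1"))` prints None when the sizes match and a short description when they do not. It downloads only the small tokenizer and config files, so it is much cheaper than loading the full model.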

Hi, I ran into the same issue. I think it is because Mixtral-8x22B-v0.1 has a tokenizer mismatch problem. You can get correct output by using the Mixtral-8x7B-v0.1 tokenizer instead:

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mixtral-8x7B-v0.1")

Or use the tokenizer from the commit linked here: https://huggingface.co/mistralai/Mixtral-8x22B-v0.1/discussions/10

Mistral AI_ org

Should be fixed now! Sorry for the delay!

Thanks for the fix and ping, @pandora-s. Works perfectly now.

jfaiofj92459 changed discussion status to closed
