More tokens than the dimension of the model

#9
by pavel-furov-ynp - opened

The model's output dimension is 128256 (so the highest token ID it can generate or score is 128255), while the highest token ID in the tokenizer is:
{
"id": 128256,
"content": "",
"single_word": false,
"lstrip": false,
"rstrip": false,
"normalized": false,
"special": false
}
I think this token is impossible to use at all. It also breaks the guidance mechanism, which tries to "allow" this token to be generated even though it does not exist in the model's logits.

For example, Mistral has 32768 tokens and its highest token ID is 32767, so it does not break guidance.
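
A rough sketch of the mismatch in plain Python. The ID ranges are toy stand-ins built from the sizes quoted in this post rather than loaded from the real tokenizers, and `out_of_range_ids` is a hypothetical helper, not part of any library:

```python
def out_of_range_ids(token_ids, logits_dim):
    """Return the token IDs that have no logit (valid IDs are 0..logits_dim - 1)."""
    return sorted(t for t in token_ids if t < 0 or t >= logits_dim)


# Assumption: token IDs are contiguous from 0 up to the highest ID.
this_model_ids = range(128257)   # tokenizer IDs 0..128256
mistral_ids = range(32768)       # tokenizer IDs 0..32767

print(out_of_range_ids(this_model_ids, 128256))  # [128256]
print(out_of_range_ids(mistral_ids, 32768))      # []
```

Any constrained-decoding mask built from the tokenizer's vocabulary would presumably need to skip IDs reported by a check like this, since indexing the logits with them is out of bounds.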

Can somebody explain the purpose of this token?
