Strange formatting and issues with token counting?

#2
by MarinaraSpaghetti - opened

I already reported this on the Discord server, but I wanted to ask whether anyone else is seeing the model output spaces in the wrong places (right after a new line, or just before a closing quotation mark). It also seems to produce outputs longer than the configured response length. Honestly, it feels like something is off with the tokenizer, but I wanted to check that I'm not the only one experiencing these issues.
I am running exl2 quants, and the issues occur with both the 'official' quants and the ones from Bartowski. In SillyTavern, I have the 'API' tokenizer set as my default, with 0 token padding.
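
For anyone trying to confirm the same symptoms on their own outputs, here is a minimal sketch that flags the two formatting problems described above: whitespace right after a line break, and whitespace right before a closing quotation mark. The function name `find_stray_spaces` and the regex heuristics are my own assumptions for illustration, not part of SillyTavern or any backend. It only handles straight double quotes, and it guesses that a quote is a closing one when it is followed by whitespace, punctuation, or the end of the text.

```python
import re

# Heuristic patterns (assumptions for illustration, not from any library):
# one or more spaces/tabs immediately after a line break.
SPACE_AFTER_NEWLINE = re.compile(r"\n[ \t]+")
# one or more spaces/tabs immediately before a straight double quote that
# looks like a closing quote (followed by whitespace, punctuation, or end).
SPACE_BEFORE_CLOSING_QUOTE = re.compile(r'[ \t]+"(?=\s|$|[.,!?;:])')


def find_stray_spaces(text: str) -> list[tuple[str, int]]:
    """Return (kind, character_offset) pairs for each suspicious space run."""
    hits = []
    for match in SPACE_AFTER_NEWLINE.finditer(text):
        hits.append(("space_after_newline", match.start()))
    for match in SPACE_BEFORE_CLOSING_QUOTE.finditer(text):
        hits.append(("space_before_closing_quote", match.start()))
    # Sort by position so the report reads in document order.
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    sample = '"Hello there. "\n She smiled.'
    for kind, offset in find_stray_spaces(sample):
        print(f"{kind} at offset {offset}")
```

Running this over a batch of saved responses should at least show whether the stray spaces are consistent across quants or only appear with certain settings. It does not check token counts, so the response-length issue would need to be compared separately against whatever tokenizer the backend reports.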

[Attached screenshots: Screenshot 2024-07-04 at 01.31.18.png, Screenshot 2024-07-04 at 01.37.15.png, Screenshot 2024-07-04 at 01.37.20.png]
