Slow to load tokenizer

by gptzerozero

Has anyone noticed that it takes a long time (about 2 minutes) to load the tokenizer for this GPTQ model, while other GPTQ models like TheBloke/WizardLM-33B-V1-0-Uncensored-SuperHOT-8K-GPTQ load within a second (around 100 ms)?

from transformers import AutoTokenizer

model_id = "path_to_downloaded_models/TheBloke_LongChat-13B-GPTQ"
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
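In the meantime, a possible workaround (a minimal sketch, assuming the repo only ships the slow SentencePiece tokenizer): the two-minute delay is the slow-to-fast conversion that transformers runs on every load when tokenizer.json is missing, so you can pay that cost once and save the converted tokenizer back into the model directory.

from transformers import AutoTokenizer

model_id = "path_to_downloaded_models/TheBloke_LongChat-13B-GPTQ"

# First load triggers the slow-to-fast conversion; this is the slow step
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)

# Writes tokenizer.json alongside the model, so later loads skip the conversion
tokenizer.save_pretrained(model_id)

After that, subsequent from_pretrained calls should load in well under a second.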

Oh, I never uploaded a fast tokenizer for this. I'll sort that out now.

Done. Trigger a download of the model again and it will pick up tokenizer.json, and then the tokenizer will load instantly.
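If you'd rather not re-download the whole model, here is a minimal sketch that fetches only the new file (assuming the repo id is TheBloke/LongChat-13B-GPTQ and local_dir is the download path from the snippet above):

from huggingface_hub import hf_hub_download

# Pull just the newly added tokenizer.json into the existing local model directory
hf_hub_download(
    repo_id="TheBloke/LongChat-13B-GPTQ",
    filename="tokenizer.json",
    local_dir="path_to_downloaded_models/TheBloke_LongChat-13B-GPTQ",
)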

Hello, is there an 8-bit GPTQ version of this model?
