Can't download the tokenizer

#15
by alerio - opened

Hi, I was trying to download the model and tokenizer via the following code

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "hivemind/gpt-j-6B-8bit"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_8bit=True,  # NOTE: loading GPT-2 in 8 bit did not work
    device_map={"": torch.cuda.current_device()},
)

But I got this error: Can't load tokenizer for 'hivemind/gpt-j-6B-8bit'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'hivemind/gpt-j-6B-8bit' is the correct path to a directory containing all relevant files for a GPT2TokenizerFast tokenizer.

Any help will be appreciated!
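A possible workaround, assuming the error simply means the 8-bit repo does not ship tokenizer files: since the 8-bit model was converted from EleutherAI/gpt-j-6B, the base GPT-J tokenizer should be compatible. A sketch (the `load_tokenizer` helper name and the `(OSError, ValueError)` catch are my own, not from the thread):

```python
def load_tokenizer(model_name: str, fallback: str = "EleutherAI/gpt-j-6B"):
    """Load the repo's tokenizer, falling back to the base GPT-J tokenizer.

    Assumption: hivemind/gpt-j-6B-8bit was converted from EleutherAI/gpt-j-6B,
    so the base tokenizer is compatible even if the 8-bit repo lacks
    tokenizer files.
    """
    from transformers import AutoTokenizer  # imported lazily, only needed at call time

    try:
        return AutoTokenizer.from_pretrained(model_name)
    except (OSError, ValueError):
        # Tokenizer files missing from the 8-bit repo; use the base repo's.
        return AutoTokenizer.from_pretrained(fallback)
```

Then `tokenizer = load_tokenizer("hivemind/gpt-j-6B-8bit")` followed by `tokenizer.pad_token = tokenizer.eos_token` as before.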

Hey Alerio, I tried the following code from this example: https://colab.research.google.com/drive/1qOjXfQIAULfKvZqwCen8-MoWKGdSatZ4#scrollTo=W8tQtyjp75O

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "hivemind/gpt-j-6B-8bit"
model_8bit = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", load_in_8bit=True)
tokenizer = AutoTokenizer.from_pretrained(model_name)

But I still get the following error:

NameError: name 'init_empty_weights' is not defined
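This NameError usually means the `accelerate` package is not installed; transformers pulls `init_empty_weights` from it when `device_map` / `load_in_8bit` are used (my reading of the error, not confirmed in this thread). A quick environment check, nothing model-specific:

```python
import importlib.util

def has_accelerate() -> bool:
    """True if the `accelerate` package is importable in this environment."""
    return importlib.util.find_spec("accelerate") is not None

# If this prints False, `pip install accelerate` before retrying
# AutoModelForCausalLM.from_pretrained(..., device_map="auto", load_in_8bit=True)
print(has_accelerate())
```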

BTW the following works but crashes because of OOM:

import transformers

# convert_to_int8 is defined in the linked Colab notebook
class GPTJForCausalLM(transformers.models.gptj.modeling_gptj.GPTJForCausalLM):
    def __init__(self, config):
        super().__init__(config)
        convert_to_int8(self)  # swap in 8-bit weight modules before loading

gpt = GPTJForCausalLM.from_pretrained("hivemind/gpt-j-6B-8bit", low_cpu_mem_usage=True)
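For context on the OOM, a back-of-the-envelope estimate of what the int8 checkpoint alone occupies (a rough sketch; the ~6B parameter count is approximate, and activations, quantization scales, and the CUDA context all add overhead on top, whether the shortage is CPU RAM or VRAM):

```python
def int8_weights_gib(n_params: float = 6e9, bytes_per_param: int = 1) -> float:
    """Rough size of the weights alone: ~6B parameters at one byte each for int8.
    Real usage is higher: activations, scales, and framework overhead are extra."""
    return n_params * bytes_per_param / 1024**3

print(f"{int8_weights_gib():.1f} GiB")  # weights only, before any overhead
```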
