[Code Help] quick start code snippet taking too long to generate a response

#194
by ppoptart - opened

Hi, I have tried running the code below both locally in VS Code and in Google Colab, but it takes a very long time to run or never finishes generating. Can someone help me fix this, or is this normal behaviour?

import transformers
import torch
access_token = 'MY_TOKEN'

model_id = "meta-llama/Meta-Llama-3-8B"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
    token=access_token,
)
pipeline("Hey how are you doing today?")
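
To tell "slow" apart from "never finishes", one option is to time a deliberately short generation. The sketch below is only an illustration: `time_call` is a hypothetical helper, not part of transformers, and the commented usage assumes the `pipeline` object from the snippet above and passes `max_new_tokens` to cap the output length.

```python
import time


def time_call(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Usage with the pipeline defined above (not executed here, since it
# needs the model weights to be downloaded and loaded first):
# output, seconds = time_call(
#     pipeline, "Hey how are you doing today?", max_new_tokens=20
# )
# print(f"{seconds:.1f} s for up to 20 new tokens")
```

If a short run like this does finish, the time it takes gives a rough idea of what a longer generation would cost on the same machine.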

I have the same problem. My platform is a cloud virtual machine with 32 GB of memory and a 16-core AMD CPU, with no GPU. It takes about 30-60 minutes to generate an answer. If anyone knows how to speed this up, please share. Thanks a lot.
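
For a rough sense of whether memory is the limiting factor on a machine like this, here is a back-of-envelope sketch. It assumes about 8 billion parameters (taken from the model name, not measured) and counts the weights only, ignoring activations and the KV cache. `estimate_weight_gib` is a hypothetical helper, not a transformers function.

```python
def estimate_weight_gib(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory needed to hold the model weights, in GiB."""
    return num_params * bytes_per_param / 1024**3


# Assumption: roughly 8e9 parameters, inferred from the "8B" in the model id.
ASSUMED_PARAMS = 8e9
BYTES_BF16 = 2  # torch.bfloat16 stores each value in 2 bytes
BYTES_FP32 = 4  # torch.float32 stores each value in 4 bytes

bf16_gib = estimate_weight_gib(ASSUMED_PARAMS, BYTES_BF16)
fp32_gib = estimate_weight_gib(ASSUMED_PARAMS, BYTES_FP32)

print(f"bfloat16 weights: ~{bf16_gib:.1f} GiB")
print(f"float32 weights:  ~{fp32_gib:.1f} GiB")
```

Under these assumptions the bfloat16 weights should fit in 32 GB of RAM, which suggests the slowness is more likely due to CPU compute and memory bandwidth than to swapping, though nothing here measures that directly.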
