SageMaker deploy failure - KeyError: 'gpt_neox'

#8
by mldavid101 - opened

I'm trying to run the sample code on SageMaker, but the predict call fails:

from sagemaker.huggingface import HuggingFaceModel
import sagemaker

role = sagemaker.get_execution_role()

# Hub Model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID': 'EleutherAI/gpt-neox-20b',
    'HF_TASK': 'text-generation'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    transformers_version='4.17.0',
    pytorch_version='1.10.2',
    py_version='py38',
    env=hub,
    role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,     # number of instances
    instance_type='ml.m5.xlarge'  # ec2 instance type
)

predictor.predict({
    'inputs': "Can you please let us know more details about your "
})

=====================

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
"code": 400,
"type": "InternalServerException",
"message": "\u0027gpt_neox\u0027"
}

The corresponding logs:
W-EleutherAI__gpt-neox-20b-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - config = AutoConfig.from_pretrained(model, revision=revision, _from_pipeline=task, **model_kwargs)
W-EleutherAI__gpt-neox-20b-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.8/site-packages/transformers/models/auto/configuration_auto.py", line 657, in from_pretrained
W-EleutherAI__gpt-neox-20b-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - config_class = CONFIG_MAPPING[config_dict["model_type"]]
W-EleutherAI__gpt-neox-20b-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.8/site-packages/transformers/models/auto/configuration_auto.py", line 372, in getitem
W-EleutherAI__gpt-neox-20b-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - raise KeyError(key)
W-EleutherAI__gpt-neox-20b-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - KeyError: 'gpt_neox'

Any idea how to solve it?

Same issue here.

Same issue here. Any help?

Same issue on my SageMaker.

Hey folks, I just saw this as well. Will triage.

I am pretty sure the core issue here is the transformers version. GPT-NeoX support was added in a later release of the transformers SDK, so the boilerplate provided above, which pins 4.17, doesn't have the gpt_neox model type registered; hence the KeyError from AutoConfig. When I use a more recent version locally on my notebook, 4.25, I can load GPT-NeoX without an issue.
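As a quick local sanity check (a sketch; 4.25 is the version mentioned above, everything else here is my own illustration), loading just the model config reproduces the failure under 4.17 and succeeds under a recent release:

from transformers import AutoConfig

# Under transformers 4.17 this raises KeyError: 'gpt_neox' because the
# model type is not in CONFIG_MAPPING; under a recent release (e.g. 4.25)
# it resolves to the GPT-NeoX config class.
config = AutoConfig.from_pretrained('EleutherAI/gpt-neox-20b')
print(config.model_type)  # -> 'gpt_neox'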

To solve this on the hosting side of things, bring a requirements.txt file that pins a more recent version of the transformers SDK; see the sketch below.

You can build a custom image and point to it, or you can pass an entry point with a requirements.txt.
https://sagemaker.readthedocs.io/en/stable/api/inference/model.html
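A minimal sketch of the requirements.txt route, assuming a local code/ directory with a small handler script (the inference.py name and the pinned version are my assumptions, not from this thread). SageMaker bundles source_dir into the model archive, and the container installs code/requirements.txt before loading the model:

# code/requirements.txt contains one line:
#   transformers==4.25.1

from sagemaker.huggingface import HuggingFaceModel
import sagemaker

role = sagemaker.get_execution_role()

hub = {
    'HF_MODEL_ID': 'EleutherAI/gpt-neox-20b',
    'HF_TASK': 'text-generation'
}

huggingface_model = HuggingFaceModel(
    transformers_version='4.17.0',  # base container; requirements.txt upgrades transformers
    pytorch_version='1.10.2',
    py_version='py38',
    env=hub,
    role=role,
    entry_point='inference.py',     # hypothetical handler script inside code/
    source_dir='code',              # directory that also contains requirements.txt
)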

stellaathena changed discussion status to closed
