SageMaker deploy failure - KeyError: 'gpt_neox'

#8
by mldavid101 - opened

I'm trying to run the sample code on SageMaker, but the predict call fails:

from sagemaker.huggingface import HuggingFaceModel
import sagemaker

role = sagemaker.get_execution_role()

# Hub Model configuration. https://huggingface.co/models
hub = {
    'HF_MODEL_ID': 'EleutherAI/gpt-neox-20b',
    'HF_TASK': 'text-generation'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    transformers_version='4.17.0',
    pytorch_version='1.10.2',
    py_version='py38',
    env=hub,
    role=role,
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,     # number of instances
    instance_type='ml.m5.xlarge'  # ec2 instance type
)

predictor.predict({
    'inputs': "Can you please let us know more details about your "
})

=====================

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from primary with message "{
"code": 400,
"type": "InternalServerException",
"message": "\u0027gpt_neox\u0027"
}

The corresponding logs:
W-EleutherAI__gpt-neox-20b-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - config = AutoConfig.from_pretrained(model, revision=revision, _from_pipeline=task, **model_kwargs)
W-EleutherAI__gpt-neox-20b-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.8/site-packages/transformers/models/auto/configuration_auto.py", line 657, in from_pretrained
W-EleutherAI__gpt-neox-20b-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - config_class = CONFIG_MAPPING[config_dict["model_type"]]
W-EleutherAI__gpt-neox-20b-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.8/site-packages/transformers/models/auto/configuration_auto.py", line 372, in getitem
W-EleutherAI__gpt-neox-20b-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - raise KeyError(key)
W-EleutherAI__gpt-neox-20b-1-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - KeyError: 'gpt_neox'

Any idea how to solve it?

Same issue here.

Same issue here. Any help?

Same issue on my SageMaker.

Hey folks, I just saw this as well. Will triage.

I am pretty sure the core issue here is the transformers version. GPT-NeoX support was added in a later release of the transformers SDK, so the boilerplate provided above, which pins 4.17, doesn't have the gpt_neox model type registered; hence the KeyError from AutoConfig. When I use a more recent version locally on my notebook, 4.25, I can load GPT-NeoX without an issue.
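As a quick local sanity check (a sketch; 4.25 is the version mentioned above, everything else here is my own illustration), loading just the model config reproduces the failure under 4.17 and succeeds under a recent release:

from transformers import AutoConfig

# Under transformers 4.17 this raises KeyError: 'gpt_neox' because the
# model type is not in CONFIG_MAPPING; under a recent release (e.g. 4.25)
# it resolves to the GPT-NeoX config class.
config = AutoConfig.from_pretrained('EleutherAI/gpt-neox-20b')
print(config.model_type)  # -> 'gpt_neox'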

To solve this on the hosting side of things, bring a requirements.txt file that pins a more recent version of the transformers SDK; see the sketch below.

You can build a custom image and point to it, or you can pass an entry point with a requirements.txt.
https://sagemaker.readthedocs.io/en/stable/api/inference/model.html
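A minimal sketch of the requirements.txt route, assuming a local code/ directory with a small handler script (the inference.py name and the pinned version are my assumptions, not from this thread). SageMaker bundles source_dir into the model archive, and the container installs code/requirements.txt before loading the model:

# code/requirements.txt contains one line:
#   transformers==4.25.1

from sagemaker.huggingface import HuggingFaceModel
import sagemaker

role = sagemaker.get_execution_role()

hub = {
    'HF_MODEL_ID': 'EleutherAI/gpt-neox-20b',
    'HF_TASK': 'text-generation'
}

huggingface_model = HuggingFaceModel(
    transformers_version='4.17.0',  # base container; requirements.txt upgrades transformers
    pytorch_version='1.10.2',
    py_version='py38',
    env=hub,
    role=role,
    entry_point='inference.py',     # hypothetical handler script inside code/
    source_dir='code',              # directory that also contains requirements.txt
)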

stellaathena changed discussion status to closed
