ImportError: This modeling file requires the following packages that were not found in your environment: flash_attn. Run `pip install flash_attn`

#76
by praveeny - opened

I was running Phi-2 on my CPU in a Jupyter notebook. When I tried it again just now, it broke :-((

I see that the model has been updated. From the little research I did, flash_attn apparently requires an Nvidia GPU? How do I run this on a CPU now? Or is that no longer an option?

P.S.: I am unable to install flash_attn. I have updated torch, transformers, and related packages (including wheel), and I now see the following error when trying to install it. I don't have CUDA.

raise OSError('CUDA_HOME environment variable is not set. '
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.

  torch.__version__  = 2.1.2+cpu

Facing the same issue. I'm trying to download the model weights and build a Docker image with vLLM, and it gave the same error. It worked perfectly fine 6 hours ago, but with the latest commit something seems broken.

In the meantime, how do we pull weights programmatically from a previous commit id?
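A rough sketch of what I have in mind, assuming the revision argument works here (the commit hash below is a placeholder, not an actual commit of this repo):

```python
# Sketch: pinning the model to a specific commit via the revision argument.
from huggingface_hub import snapshot_download
from transformers import AutoModelForCausalLM, AutoTokenizer

revision = "abc123"  # placeholder: replace with a commit id from the repo's commit history

# Option 1: download the files for that commit into the local cache
local_dir = snapshot_download("microsoft/phi-2", revision=revision)

# Option 2: load the model and tokenizer directly at that revision
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2", revision=revision, trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2", revision=revision)
```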

Microsoft org
•
edited Jan 12

Hello everyone!

We deployed a fix and it should be working now.

The issue was caused by the combination of using dynamic modules and remote code loading in transformers.

Regards,
Gustavo.

@gugarosa Hey, just curious to understand the motivation behind renaming layer_norm_epsilon to layer_norm_eps in the config.json?

I see vLLM uses layer_norm_epsilon throughout all its models, so the recent commits in this repo are now breaking things in vLLM.
(Screenshot attached: Screenshot 2024-01-12 at 11.54.05 AM.png)

Microsoft org

I think we will need to update vLLM as well.

There is no particular reason for using layer_norm_eps. It was used in the first implementation of Phi (internally in transformers), and we followed it to minimize friction when merging the integration.

Microsoft org

By the way, there is an active PR that will fix it: https://github.com/vllm-project/vllm/pull/2428/files

Since the layer naming was changed for consistency reasons, don't you think it would be better to align with "layer_norm_epsilon" too?
On the other hand, Llama uses "rms_norm_eps"... go figure.

Microsoft org
•
edited Jan 12

I definitely agree!

Maybe an attribute_map = {"layer_norm_epsilon": "layer_norm_eps"} in configuration_phi.py would fix the issue, and it would be an easier PR.
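Something like this minimal sketch (simplified, not the actual configuration_phi.py) is what I have in mind:

```python
# Sketch: PretrainedConfig.attribute_map aliases the old attribute name to the new one,
# so code that reads config.layer_norm_epsilon still resolves to config.layer_norm_eps.
from transformers import PretrainedConfig

class PhiConfig(PretrainedConfig):
    model_type = "phi"
    attribute_map = {"layer_norm_epsilon": "layer_norm_eps"}

    def __init__(self, layer_norm_eps=1e-5, **kwargs):
        self.layer_norm_eps = layer_norm_eps
        super().__init__(**kwargs)

config = PhiConfig()
print(config.layer_norm_epsilon)  # 1e-05, resolved through attribute_map
```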

praveeny changed discussion status to closed
