Reward model returns 0 scores for all cases

by iseesaw - opened 13 days ago

13 days ago

Thanks for your wonderful model!

Could you please take a look at this issue? When I run the Skywork reward model on multiple GPUs (4x A6000), every reward score comes back as 0, unlike the non-zero scores in the official single-GPU example.

Installed version: transformers 4.44.2

chrisliu298

Skywork org 13 days ago

Hi,

Are you using the model across multiple GPUs in a pipeline-parallel or data-parallel configuration? Can you share the code that reproduces the error?

iseesaw

13 days ago

I just used the provided code example and set device_map="auto", so it is likely pipeline-parallel.
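
To check how device_map="auto" actually split the model, it may help to look at the placement that transformers records on the loaded model as `model.hf_device_map`. The following is only a minimal sketch: `summarize_device_map` is a hypothetical helper, and `example_map` is made-up data standing in for the real attribute, not output from this model.

```python
from collections import Counter


def summarize_device_map(device_map):
    """Count how many modules were placed on each device.

    `device_map` is a dict of module name -> device, in the shape of
    `model.hf_device_map` after loading with device_map="auto".
    """
    return dict(Counter(str(device) for device in device_map.values()))


# Hypothetical placement used for illustration only; on a real run,
# pass model.hf_device_map instead.
example_map = {
    "model.embed_tokens": 0,
    "model.layers.0": 0,
    "model.layers.1": 1,
    "score": 1,
}
print(summarize_device_map(example_map))
```

If the summary shows the layers spread over several GPUs, the model is being sharded layer by layer rather than replicated, which narrows down where the zero scores could come from.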

chrisliu298

Skywork org 13 days ago

I ran the code on 2x, 4x, and 8x A800 but couldn't reproduce the problem.

We suggest installing transformers from source and upgrading flash-attention to the latest version. You could also try setting attn_implementation to "eager" to see whether that resolves the issue.
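
A rough sketch of the eager-attention test is below. It assumes the model is loaded through `AutoModelForSequenceClassification.from_pretrained` in bfloat16, as in a typical reward-model example; `build_load_kwargs` and `load_reward_model` are hypothetical helper names, and `model_name` is whatever checkpoint you are already using.

```python
def build_load_kwargs(attn_implementation="eager", device_map="auto"):
    """Collect the keyword arguments to pass to from_pretrained."""
    return {
        "device_map": device_map,
        # "eager" bypasses flash-attention so it can be ruled out as the cause.
        "attn_implementation": attn_implementation,
    }


def load_reward_model(model_name, **overrides):
    """Load the model and tokenizer with the selected attention backend."""
    # Imported here so build_load_kwargs can be used without the heavy dependencies.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    model = AutoModelForSequenceClassification.from_pretrained(
        model_name,
        torch_dtype=torch.bfloat16,  # assumption: same dtype as your current script
        **build_load_kwargs(**overrides),
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    return model, tokenizer


print(build_load_kwargs())
```

If the scores become non-zero with "eager", the problem most likely sits in the flash-attention install rather than in the multi-GPU placement.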
