Reward model returns 0 scores for all cases

#1
by iseesaw - opened

Thanks for your wonderful model!

Could you please help with this issue? When I run the Skywork reward model on multiple GPUs (4x A6000), every reward score comes back as 0, whereas the official single-GPU example produces non-zero scores.

Environment: transformers 4.44.2
Skywork org

Hi,

Are you using the model across multiple GPUs in a pipeline-parallel or data-parallel configuration? Can you share the code that reproduces the error?

I just used the provided code example and set device_map="auto" (so it is likely pipeline-parallel).

Skywork org

I ran the code on 2x, 4x, and 8x A800 but couldn't reproduce the problem.

We suggest installing transformers from source and upgrading flash-attention to the latest version. Additionally, you could try setting attn_implementation to "eager" to see if that resolves the issue.
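
For reference, a minimal sketch of what the eager-attention suggestion could look like when combined with device_map="auto". This is not the official example: `build_load_kwargs` and `load_reward_model` are hypothetical helper names, `model_id` is a placeholder for whichever Skywork reward checkpoint you are using, and the bfloat16 dtype is an assumption.

```python
from typing import Any, Dict


def build_load_kwargs(attn_implementation: str = "eager") -> Dict[str, Any]:
    """Keyword arguments for from_pretrained (hypothetical helper, easy to inspect)."""
    return {
        # Let accelerate shard the layers across all visible GPUs.
        "device_map": "auto",
        # "eager" bypasses flash-attention, which helps rule it out as the cause.
        "attn_implementation": attn_implementation,
    }


def load_reward_model(model_id: str, attn_implementation: str = "eager"):
    """Load a sequence-classification reward model and its tokenizer.

    model_id is a placeholder: pass the checkpoint name you are actually using.
    """
    # Imported lazily so build_load_kwargs works without torch installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    model = AutoModelForSequenceClassification.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # assumption: dtype is not stated in this thread
        **build_load_kwargs(attn_implementation),
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    return model, tokenizer
```

After loading, `model.hf_device_map` shows which layers ended up on which GPU, which may help when comparing the multi-GPU run against the single-GPU one.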
