Could you let me know when the bfloat16 model will be uploaded? I can't run the float32 model!
#5 opened by Cach
Yes, we would like to build a bfloat16-compatible version. In the meantime you can run this model with torch.autocast to save some memory:

```python
with torch.autocast("cuda", enabled=True, dtype=autocast_precision):
    ...
```

We did our evaluations in that setting (float32 weights with autocast enabled).
The current code does not support bfloat16 inference directly, but you can try it with torch.autocast:
```python
import torch
from transformers import GenerationConfig

with torch.autocast("cuda", enabled=True, dtype=torch.bfloat16):
    output = model.generate_from_batch(
        inputs,
        GenerationConfig(max_new_tokens=200, stop_strings="<|endoftext|>"),
        tokenizer=processor.tokenizer,
    )
```
Note that the weights will still be in float32.
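To see what that means in practice, here is a minimal sketch of how autocast keeps the weights in float32 while the computation runs in bfloat16. It uses a small `nn.Linear` and the CPU autocast backend so it runs without a GPU; the same principle applies with `"cuda"`:

```python
import torch

# Weights of a freshly created layer are float32 by default.
layer = torch.nn.Linear(4, 4)
x = torch.randn(1, 4)

# Under autocast, eligible ops (like linear) run in bfloat16,
# but the stored parameters are not converted.
with torch.autocast("cpu", enabled=True, dtype=torch.bfloat16):
    y = layer(x)

print(layer.weight.dtype)  # torch.float32 -- weights stay in full precision
print(y.dtype)             # torch.bfloat16 -- the compute/output is downcast
```

So autocast trades activation memory and compute precision, but the float32 weights still have to fit in memory, which is why a true bfloat16 checkpoint would still help.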
Will a bfloat16 version still be released at some point in the future, though?