Not passing attention_mask in model.generate

#25
by hcy511 - opened

Hi, I wonder why there's no need to pass the attention_mask (the commented-out line below) to model.generate during inference. Thanks!

outputs = self.model.generate(
    input_ids=model_inputs['input_ids'],
    pixel_values=model_inputs['pixel_values'],
    # attention_mask=model_inputs['attention_mask'],
    max_new_tokens=100,
    early_stopping=False,
    do_sample=False,
)

Microsoft org

Hi, the Florence-2 language model is an encoder-decoder, and the attention_mask for the inputs is all ones.
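
As an illustrative aside (not part of the original reply): for a single, unpadded prompt the processor's attention_mask is indeed all ones, so omitting it from generate() changes nothing. This is a minimal sketch; the checkpoint name and loading flags are assumptions.

# Sketch: check that the mask is all ones for a single prompt (assumed checkpoint).
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

processor = AutoProcessor.from_pretrained("microsoft/Florence-2-large", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("microsoft/Florence-2-large", trust_remote_code=True)

image = Image.new("RGB", (768, 768))  # placeholder image
model_inputs = processor(text="<CAPTION>", images=image, return_tensors="pt")

# With one prompt there is no padding, so every position is a real token.
assert torch.all(model_inputs["attention_mask"] == 1)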

But wouldn't we want an attention mask for padded tokens? We don't want to attend over padded tokens in the encoder, or am I misunderstanding?
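
For the padded-batch case the follow-up asks about, a hedged sketch is below: if the processor pads a batch of prompts of different lengths, the mask is no longer all ones, and forwarding it to generate() is the safe choice. The task prompts and the padding argument to the processor are assumptions, continuing from the sketch above.

# Sketch: batch of prompts with padding; forward the mask so padded positions
# are not attended to (prompt strings and padding behavior are assumptions).
batch_inputs = processor(
    text=["<CAPTION>", "<DETAILED_CAPTION>"],  # different prompt lengths -> padding
    images=[image, image],
    return_tensors="pt",
    padding=True,
)
outputs = model.generate(
    input_ids=batch_inputs["input_ids"],
    pixel_values=batch_inputs["pixel_values"],
    attention_mask=batch_inputs["attention_mask"],  # masks out the padded positions
    max_new_tokens=100,
    do_sample=False,
)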
