Not passing attention_mask in model.generate
#25 · by hcy511 · opened
Hi, I wonder why there's no need to pass the attention_mask (the commented line below) in model.generate during inference. Thanks!
outputs = self.model.generate(
    input_ids=model_inputs['input_ids'],
    pixel_values=model_inputs['pixel_values'],
    # attention_mask=model_inputs['attention_mask'],
    max_new_tokens=100,
    early_stopping=False,
    do_sample=False,
)
hi, Florence-2's language model is encoder-decoder, and the attention_mask for the inputs is all ones
But wouldn't we still want an attention mask for padded tokens? We don't want the encoder to attend over padding, or am I misunderstanding?
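For context, here is a minimal sketch of the distinction being discussed (the pad id and token ids are hypothetical, not Florence-2's real vocabulary). With a single unpadded sequence, the attention mask is trivially all ones, which matches the default the model assumes when no mask is passed, so omitting it is harmless. With a padded batch, the zeros in the mask are exactly what keep the encoder from attending to pad tokens:

```python
def build_attention_mask(batch, pad_id):
    """Return 1 for real tokens and 0 for padding, per position."""
    return [[int(tok != pad_id) for tok in seq] for seq in batch]

pad_id = 0  # hypothetical pad token id

# Batch size 1, no padding: the mask is all ones, so passing it
# is equivalent to passing nothing at all.
single = [[101, 7592, 102]]
print(build_attention_mask(single, pad_id))   # [[1, 1, 1]]

# A batch of sequences of different lengths, right-padded to length 5:
# here the zeros matter, and the mask should be passed to generate().
batched = [[101, 7592, 102, 0, 0],
           [101, 7592, 2088, 999, 102]]
print(build_attention_mask(batched, pad_id))  # [[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]]
```

So the reply above holds for unbatched inference, while the follow-up concern applies as soon as sequences in a batch are padded to a common length.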