
flan-t5-s

This model is a fine-tuned version of google/flan-t5-small on an unspecified dataset. It achieves the following results on the evaluation set (a minimal usage sketch follows the metrics):

  • Loss: 0.2736
  • ROUGE-1: 40.152
  • ROUGE-2: 15.8816
  • ROUGE-L: 33.4399
  • ROUGE-Lsum: 35.9029
  • Gen Len: 19.886
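
Given the ROUGE metrics and a mean generation length of about 20 tokens, the checkpoint appears to be a summarization fine-tune. Below is a minimal inference sketch; the repo id `dtruong46me/flan-t5-s` comes from this card, while the `summarize:` prompt format and the generation settings are assumptions, since the training data and prompt format are not documented.

```python
# Minimal inference sketch for this checkpoint (prompt format is an assumption).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("dtruong46me/flan-t5-s")
model = AutoModelForSeq2SeqLM.from_pretrained("dtruong46me/flan-t5-s")

# Hypothetical input; FLAN-T5 checkpoints are commonly prompted this way.
text = "summarize: The quick brown fox jumped over the lazy dog several times ..."
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# max_new_tokens=20 roughly matches the reported mean generation length (~19.9).
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```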

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent training arguments follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 6
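
As a hedged reconstruction, the list above maps onto transformers' `Seq2SeqTrainingArguments` roughly as follows; `output_dir` and `predict_with_generate` are assumptions not stated in the card, and the listed Adam betas/epsilon match the library defaults, so they are not set explicitly.

```python
# Sketch of the training arguments implied by the hyperparameter list above.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-s",          # assumed; not stated in the card
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,   # yields the total train batch size of 16
    num_train_epochs=6,
    lr_scheduler_type="linear",
    seed=42,
    # Adam betas=(0.9, 0.999) and epsilon=1e-8 are the transformers defaults.
    predict_with_generate=True,      # assumed; needed to compute ROUGE at eval time
)
```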

Training results

| Training Loss | Epoch | Step  | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Gen Len |
|---------------|-------|-------|-----------------|---------|---------|---------|------------|---------|
| 0.347         | 1.0   | 2307  | 0.2918          | 38.3203 | 14.7065 | 31.7739 | 34.536     | 19.904  |
| 0.2765        | 2.0   | 4615  | 0.2817          | 38.9417 | 15.3147 | 32.5082 | 35.1789    | 19.884  |
| 0.2683        | 3.0   | 6922  | 0.2776          | 39.3458 | 15.3133 | 32.7661 | 35.2993    | 19.878  |
| 0.2635        | 4.0   | 9230  | 0.2751          | 39.7671 | 15.7051 | 33.1173 | 35.6438    | 19.884  |
| 0.2611        | 5.0   | 11537 | 0.2738          | 39.8607 | 15.5855 | 33.1643 | 35.6319    | 19.882  |
| 0.2592        | 6.0   | 13842 | 0.2736          | 40.152  | 15.8816 | 33.4399 | 35.9029    | 19.886  |
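
The card does not say how the ROUGE scores above were computed. A typical setup for seq2seq fine-tuning runs like this one uses the Hugging Face `evaluate` library; the sketch below shows the usual call, with hypothetical predictions and references standing in for the real evaluation outputs.

```python
# Minimal sketch of computing ROUGE with the `evaluate` library.
import evaluate

rouge = evaluate.load("rouge")

predictions = ["a generated summary of the document"]   # hypothetical model outputs
references = ["the reference summary of the document"]  # hypothetical gold targets

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # keys: 'rouge1', 'rouge2', 'rougeL', 'rougeLsum'
```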

Framework versions

  • Transformers 4.36.1
  • PyTorch 2.1.2
  • Datasets 2.20.0
  • Tokenizers 0.15.2