license: apache-2.0
base_model: google-t5/t5-base
tags:
  - generated_from_trainer
model-index:
  - name: t5-abs-2309-1054-lr-1e-05-bs-2-maxep-20
    results: []

t5-abs-2309-1054-lr-1e-05-bs-2-maxep-20

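This model is a fine-tuned version of google-t5/t5-base. The training dataset was not recorded (the Trainer logged its name as None). After the final epoch (epoch 20, step 4340), it achieves the following results on the evaluation set: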

  • Loss: 4.1057
  • Rouge/rouge1: 0.4734
  • Rouge/rouge2: 0.2314
  • Rouge/rougel: 0.4044
  • Rouge/rougelsum: 0.4048
  • Bertscore/bertscore-precision: 0.8983
  • Bertscore/bertscore-recall: 0.8989
  • Bertscore/bertscore-f1: 0.8984
  • Meteor: 0.4395
  • Gen Len: 41.1

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
  • mixed_precision_training: Native AMP
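The hyperparameters above imply a few derived quantities that the list does not spell out. The following is a minimal sketch, assuming the usual linear warmup followed by linear decay to zero and a warmup length of `ceil(total_steps * warmup_ratio)`; the 217 optimizer steps per epoch are read from the training results table, and the helper name `linear_lr` is hypothetical, not part of any library.

```python
import math

# Values copied from the hyperparameter list and the training results table.
base_lr = 1e-05
train_batch_size = 2
gradient_accumulation_steps = 2
num_epochs = 20
warmup_ratio = 0.1
steps_per_epoch = 217  # optimizer steps logged at epoch 1.0

# Effective batch size per optimizer step (matches total_train_batch_size: 4).
total_train_batch_size = train_batch_size * gradient_accumulation_steps

total_steps = steps_per_epoch * num_epochs
# Assumption: warmup length is the ceiling of ratio * total steps.
warmup_steps = math.ceil(total_steps * warmup_ratio)


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", total_train_batch_size)
    print("total optimizer steps:", total_steps)
    print("warmup steps:", warmup_steps)
    for step in (0, 217, 434, 2387, 4340):
        print(step, linear_lr(step))
```

Under these assumptions the warmup covers the first two epochs (434 steps), so the learning rate reaches its peak of 1e-05 only at the end of epoch 2 and decays linearly from there.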

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge/rouge1 | Rouge/rouge2 | Rouge/rougel | Rouge/rougelsum | Bertscore/bertscore-precision | Bertscore/bertscore-recall | Bertscore/bertscore-f1 | Meteor | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------------:|:------------:|:------------:|:---------------:|:-----------------------------:|:--------------------------:|:----------------------:|:------:|:-------:|
| 0.0048 | 1.0 | 217 | 4.0191 | 0.4796 | 0.2348 | 0.4105 | 0.4113 | 0.8989 | 0.8999 | 0.8993 | 0.445 | 41.1636 |
| 0.0019 | 2.0 | 434 | 4.0490 | 0.4749 | 0.2307 | 0.406 | 0.4074 | 0.8979 | 0.8986 | 0.8981 | 0.4412 | 40.8364 |
| 0.0062 | 3.0 | 651 | 4.0644 | 0.4795 | 0.2336 | 0.4078 | 0.4094 | 0.898 | 0.9 | 0.8988 | 0.4468 | 41.9 |
| 0.0062 | 4.0 | 868 | 4.0660 | 0.4789 | 0.2299 | 0.4056 | 0.4062 | 0.8986 | 0.899 | 0.8986 | 0.4406 | 41.1909 |
| 0.0114 | 5.0 | 1085 | 4.0761 | 0.4755 | 0.2298 | 0.4046 | 0.405 | 0.899 | 0.8991 | 0.8989 | 0.4421 | 40.8182 |
| 0.0106 | 6.0 | 1302 | 4.0854 | 0.4732 | 0.2267 | 0.401 | 0.4021 | 0.8982 | 0.8992 | 0.8986 | 0.4401 | 41.1273 |
| 0.0112 | 7.0 | 1519 | 4.0993 | 0.4706 | 0.2273 | 0.4008 | 0.402 | 0.8965 | 0.8987 | 0.8975 | 0.4396 | 41.7182 |
| 0.0108 | 8.0 | 1736 | 4.0949 | 0.4696 | 0.2269 | 0.3982 | 0.399 | 0.8971 | 0.8987 | 0.8978 | 0.442 | 41.8727 |
| 0.0109 | 9.0 | 1953 | 4.0946 | 0.4742 | 0.2304 | 0.4035 | 0.4037 | 0.8982 | 0.8992 | 0.8986 | 0.4447 | 41.3364 |
| 0.0103 | 10.0 | 2170 | 4.1017 | 0.4769 | 0.2333 | 0.4064 | 0.4068 | 0.8988 | 0.8996 | 0.8991 | 0.4469 | 41.1182 |
| 0.0102 | 11.0 | 2387 | 4.1028 | 0.4742 | 0.2304 | 0.4032 | 0.4037 | 0.898 | 0.8991 | 0.8984 | 0.444 | 41.4545 |
| 0.0101 | 12.0 | 2604 | 4.1046 | 0.4778 | 0.233 | 0.4074 | 0.4078 | 0.8987 | 0.8993 | 0.8989 | 0.445 | 40.9182 |
| 0.0097 | 13.0 | 2821 | 4.1067 | 0.4734 | 0.2296 | 0.4034 | 0.4038 | 0.8979 | 0.8985 | 0.8981 | 0.4396 | 41.0 |
| 0.0092 | 14.0 | 3038 | 4.1086 | 0.4727 | 0.229 | 0.4022 | 0.4027 | 0.8979 | 0.8984 | 0.898 | 0.4395 | 41.0818 |
| 0.0094 | 15.0 | 3255 | 4.1076 | 0.4727 | 0.2288 | 0.4025 | 0.403 | 0.8978 | 0.8984 | 0.898 | 0.439 | 41.1091 |
| 0.0094 | 16.0 | 3472 | 4.1075 | 0.4733 | 0.2284 | 0.4024 | 0.4033 | 0.8976 | 0.8987 | 0.898 | 0.4389 | 41.2636 |
| 0.0088 | 17.0 | 3689 | 4.1072 | 0.473 | 0.2291 | 0.4034 | 0.4036 | 0.8981 | 0.8986 | 0.8982 | 0.4375 | 41.2545 |
| 0.0092 | 18.0 | 3906 | 4.1065 | 0.4712 | 0.2298 | 0.4023 | 0.4024 | 0.8981 | 0.8983 | 0.898 | 0.4367 | 40.9818 |
| 0.0095 | 19.0 | 4123 | 4.1058 | 0.4708 | 0.2288 | 0.4022 | 0.4026 | 0.8979 | 0.8986 | 0.8981 | 0.4368 | 41.3273 |
| 0.0091 | 20.0 | 4340 | 4.1057 | 0.4734 | 0.2314 | 0.4044 | 0.4048 | 0.8983 | 0.8989 | 0.8984 | 0.4395 | 41.1 |
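The per-epoch results above can also be inspected programmatically. The sketch below copies the Validation Loss and Rouge/rouge1 columns from the table and finds the epoch at which each is best. Choosing a checkpoint by either metric is an illustrative assumption; the card does not say that any best-checkpoint selection was used, and the headline results match epoch 20.

```python
# Validation loss and ROUGE-1 per epoch, copied from the results table (epochs 1 to 20).
val_loss = [
    4.0191, 4.0490, 4.0644, 4.0660, 4.0761, 4.0854, 4.0993, 4.0949, 4.0946, 4.1017,
    4.1028, 4.1046, 4.1067, 4.1086, 4.1076, 4.1075, 4.1072, 4.1065, 4.1058, 4.1057,
]
rouge1 = [
    0.4796, 0.4749, 0.4795, 0.4789, 0.4755, 0.4732, 0.4706, 0.4696, 0.4742, 0.4769,
    0.4742, 0.4778, 0.4734, 0.4727, 0.4727, 0.4733, 0.4730, 0.4712, 0.4708, 0.4734,
]

# Epochs are 1-indexed in the table, list positions are 0-indexed.
best_loss_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
best_rouge1_epoch = max(range(len(rouge1)), key=rouge1.__getitem__) + 1

if __name__ == "__main__":
    print("lowest validation loss:", val_loss[best_loss_epoch - 1], "at epoch", best_loss_epoch)
    print("highest ROUGE-1:", rouge1[best_rouge1_epoch - 1], "at epoch", best_rouge1_epoch)
```

In this log both the lowest validation loss (4.0191) and the highest ROUGE-1 (0.4796) occur at epoch 1, while the training loss is already close to zero, which is worth keeping in mind when interpreting the final-epoch numbers.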

Framework versions

  • Transformers 4.44.0
  • PyTorch 2.4.0
  • Datasets 2.21.0
  • Tokenizers 0.19.1