---
license: apache-2.0
base_model: facebook/bart-base
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: bart-base
    results: []
---

# bart-base

This model is a fine-tuned version of [facebook/bart-base](https://huggingface.co/facebook/bart-base) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 0.4156
- Rouge1: 41.7881
- Rouge2: 19.9952
- Rougel: 36.4308
- Rougelsum: 38.1089
- Gen Len: 18.0
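The ROUGE scores above are F-measures scaled to 0–100 (unigram overlap for Rouge1, bigram overlap for Rouge2, longest common subsequence for Rougel/Rougelsum). As an illustration only (the actual evaluation uses the `rouge` metric, not this code), a minimal ROUGE-1 F1 computation looks like:

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """Toy ROUGE-1 F1: unigram overlap between prediction and reference."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Clipped overlap: each unigram counts at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")  # 5 of 6 unigrams overlap
```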

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30
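With gradient accumulation, each optimizer step aggregates gradients from several forward/backward passes, so the effective batch size is the per-device batch size times the accumulation steps, which is how the numbers above relate:

```python
train_batch_size = 4             # per-device batch size
gradient_accumulation_steps = 2  # gradients accumulated over 2 mini-batches per optimizer step
total_train_batch_size = train_batch_size * gradient_accumulation_steps  # 8, as listed above
```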

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 0.3288        | 1.0   | 1557  | 0.2425          | 41.3515 | 19.3193 | 35.4878 | 38.2323   | 18.0    |
| 0.2455        | 2.0   | 3115  | 0.2323          | 41.1865 | 19.5825 | 35.7587 | 37.8016   | 18.0    |
| 0.2097        | 3.0   | 4672  | 0.2333          | 41.4261 | 20.2577 | 36.0437 | 38.1275   | 18.0    |
| 0.1818        | 4.0   | 6230  | 0.2400          | 42.7857 | 21.7524 | 37.4117 | 39.3948   | 18.0    |
| 0.1591        | 5.0   | 7787  | 0.2489          | 41.9402 | 21.3931 | 36.7999 | 38.7229   | 18.0    |
| 0.1392        | 6.0   | 9345  | 0.2530          | 42.1993 | 21.3725 | 36.6614 | 38.5415   | 18.0    |
| 0.1218        | 7.0   | 10902 | 0.2616          | 42.0991 | 20.7834 | 36.6721 | 38.6425   | 18.0    |
| 0.1061        | 8.0   | 12460 | 0.2794          | 41.4682 | 20.2185 | 36.0528 | 37.7626   | 18.0    |
| 0.0929        | 9.0   | 14017 | 0.2858          | 41.5178 | 20.0354 | 36.027  | 37.9562   | 18.0    |
| 0.0813        | 10.0  | 15575 | 0.3001          | 42.1686 | 20.7936 | 36.7589 | 38.6885   | 18.0    |
| 0.0715        | 11.0  | 17132 | 0.3113          | 41.5616 | 20.6733 | 36.2947 | 38.1556   | 18.0    |
| 0.0622        | 12.0  | 18690 | 0.3228          | 41.3672 | 20.0432 | 36.1746 | 38.0949   | 18.0    |
| 0.0544        | 13.0  | 20247 | 0.3296          | 41.4662 | 19.8484 | 35.9521 | 37.7284   | 18.0    |
| 0.0478        | 14.0  | 21805 | 0.3373          | 41.1417 | 20.1208 | 36.1864 | 37.9314   | 18.0    |
| 0.0423        | 15.0  | 23362 | 0.3440          | 41.1174 | 19.551  | 35.7777 | 37.5518   | 18.0    |
| 0.0373        | 16.0  | 24920 | 0.3581          | 40.7365 | 19.5894 | 35.5672 | 37.4447   | 18.0    |
| 0.0327        | 17.0  | 26477 | 0.3654          | 41.0895 | 19.4995 | 35.7195 | 37.3265   | 18.0    |
| 0.0294        | 18.0  | 28035 | 0.3750          | 40.8447 | 19.4098 | 35.557  | 37.3456   | 18.0    |
| 0.0262        | 19.0  | 29592 | 0.3790          | 41.0388 | 19.8022 | 35.946  | 37.6522   | 18.0    |
| 0.0237        | 20.0  | 31150 | 0.3841          | 41.6747 | 19.6307 | 35.9938 | 37.6853   | 18.0    |
| 0.0212        | 21.0  | 32707 | 0.3874          | 40.7796 | 19.2156 | 35.3642 | 37.1609   | 18.0    |
| 0.0192        | 22.0  | 34265 | 0.3942          | 41.2411 | 19.5756 | 35.8442 | 37.5498   | 18.0    |
| 0.0173        | 23.0  | 35822 | 0.3974          | 41.112  | 19.7216 | 35.8072 | 37.5629   | 18.0    |
| 0.0159        | 24.0  | 37380 | 0.4042          | 40.6911 | 19.1988 | 35.5312 | 37.3276   | 18.0    |
| 0.0144        | 25.0  | 38937 | 0.4090          | 41.0017 | 19.3834 | 35.7806 | 37.6217   | 18.0    |
| 0.0132        | 26.0  | 40495 | 0.4101          | 41.6159 | 19.4447 | 36.1746 | 37.9271   | 18.0    |
| 0.012         | 27.0  | 42052 | 0.4117          | 41.4618 | 19.3824 | 36.0425 | 37.8597   | 18.0    |
| 0.0112        | 28.0  | 43610 | 0.4137          | 41.5302 | 19.565  | 36.1323 | 37.8484   | 18.0    |
| 0.0105        | 29.0  | 45167 | 0.4147          | 41.5432 | 19.9581 | 36.2526 | 38.0642   | 18.0    |
| 0.0099        | 29.99 | 46710 | 0.4156          | 41.7881 | 19.9952 | 36.4308 | 38.1089   | 18.0    |
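With the linear scheduler, the learning rate decays from 5e-05 toward 0 over the run (46710 optimizer steps per the table). A sketch of that schedule, assuming zero warmup steps since none are listed:

```python
def linear_lr(step: int, base_lr: float = 5e-5, total_steps: int = 46710) -> float:
    """Linearly decay base_lr to 0 over total_steps (assumes no warmup)."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps

lr_start = linear_lr(0)      # 5e-05 at the first step
lr_mid = linear_lr(23355)    # 2.5e-05 halfway through
lr_end = linear_lr(46710)    # 0.0 at the final step
```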

### Framework versions

- Transformers 4.39.3
- Pytorch 2.1.2
- Datasets 2.19.1
- Tokenizers 0.15.2