exp2-led-risalah_data_v4

This model is a fine-tuned version of silmi224/finetune-led-35000 on an unknown dataset. After the final training epoch (epoch 30, step 300), it achieves the following results on the evaluation set:

  • Loss: 1.8431
  • Rouge1: 16.5193
  • Rouge2: 8.3503
  • Rougel: 11.7271
  • Rougelsum: 15.6162
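
The card does not say which ROUGE implementation produced these scores. As a rough illustration of what Rouge1 and Rouge2 measure, the sketch below computes an n-gram overlap F1 on a 0-100 scale. The whitespace tokenization, the helper names `ngrams` and `rouge_n_f1`, and the example sentences are all assumptions made for illustration; a real evaluation library may also apply stemming or other normalization, so its numbers can differ.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, candidate, n=1):
    """Simplified ROUGE-N F1 (0-100) using lowercase whitespace tokens."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it appears in both.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)


# Hypothetical reference summary and model output.
reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"

rouge1 = rouge_n_f1(reference, candidate, n=1)
rouge2 = rouge_n_f1(reference, candidate, n=2)
print(f"Rouge1: {rouge1:.2f}  Rouge2: {rouge2:.2f}")
```

Rouge2 is lower than Rouge1 here because a single substituted word breaks two bigrams but only one unigram, which matches the gap between the two scores reported above.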

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-06
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 300
  • num_epochs: 30
  • mixed_precision_training: Native AMP
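
To make the interaction between these settings concrete, the sketch below reproduces the effective batch size and evaluates a linear warmup/decay schedule at a few steps. The function `linear_schedule_lr` is a hypothetical helper that follows the usual linear-warmup rule, not code taken from the training run, and the total of 300 optimizer steps is read from the training results table. The batch-size arithmetic assumes a single device.

```python
def linear_schedule_lr(step, base_lr=1e-6, warmup_steps=300, total_steps=300):
    """Learning rate at a given optimizer step under linear warmup then linear decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 towards base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: ramp linearly from base_lr down to 0 at total_steps.
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


train_batch_size = 1
gradient_accumulation_steps = 8
# Assumes one device; with more devices the product would be multiplied further.
effective_batch_size = train_batch_size * gradient_accumulation_steps

lr_first = linear_schedule_lr(0)
lr_mid = linear_schedule_lr(150)
lr_last = linear_schedule_lr(299)

print(f"effective batch size: {effective_batch_size}")
print(f"lr at step 0: {lr_first:.3e}, step 150: {lr_mid:.3e}, step 299: {lr_last:.3e}")
```

Under these assumptions the warmup length equals the total number of steps, so the learning rate would still be rising at the last update and would never quite reach the configured 1e-06.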

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2 | Rougel  | Rougelsum |
|---------------|-------|------|-----------------|---------|--------|---------|-----------|
| 3.3717        | 1.0   | 10   | 2.9094          | 8.8016  | 2.3126 | 6.2771  | 8.3716    |
| 3.3649        | 2.0   | 20   | 2.8898          | 9.2296  | 2.5864 | 6.5169  | 8.8408    |
| 3.3317        | 3.0   | 30   | 2.8578          | 9.4144  | 2.7476 | 6.7319  | 8.9607    |
| 3.2876        | 4.0   | 40   | 2.8156          | 9.2048  | 2.6478 | 6.8107  | 8.8212    |
| 3.2244        | 5.0   | 50   | 2.7651          | 7.4966  | 2.3382 | 5.9094  | 6.9392    |
| 3.1638        | 6.0   | 60   | 2.7088          | 8.8105  | 2.6633 | 6.809   | 8.3272    |
| 3.087         | 7.0   | 70   | 2.6486          | 9.3756  | 2.6957 | 7.2067  | 9.0197    |
| 3.0201        | 8.0   | 80   | 2.5859          | 9.5975  | 2.7885 | 6.9418  | 9.0329    |
| 2.9335        | 9.0   | 90   | 2.5224          | 9.5107  | 2.374  | 6.8494  | 8.9865    |
| 2.8603        | 10.0  | 100  | 2.4585          | 9.8073  | 2.8793 | 7.4445  | 9.4102    |
| 2.7774        | 11.0  | 110  | 2.3954          | 10.604  | 2.8025 | 7.8035  | 10.1927   |
| 2.7011        | 12.0  | 120  | 2.3347          | 10.3728 | 3.4421 | 7.8112  | 9.5918    |
| 2.634         | 13.0  | 130  | 2.2783          | 11.0596 | 3.3087 | 7.9686  | 10.047    |
| 2.5608        | 14.0  | 140  | 2.2253          | 12.4204 | 4.4276 | 8.5552  | 11.4364   |
| 2.4866        | 15.0  | 150  | 2.1782          | 12.8046 | 4.4267 | 8.8782  | 12.2253   |
| 2.4349        | 16.0  | 160  | 2.1369          | 13.0668 | 4.3763 | 8.7619  | 12.104    |
| 2.3851        | 17.0  | 170  | 2.1012          | 13.7679 | 4.6022 | 9.1874  | 12.7284   |
| 2.3302        | 18.0  | 180  | 2.0691          | 13.2512 | 4.6911 | 9.3187  | 11.8059   |
| 2.2836        | 19.0  | 190  | 2.0403          | 14.3491 | 5.7839 | 9.8346  | 13.3638   |
| 2.236         | 20.0  | 200  | 2.0150          | 13.9778 | 4.9493 | 9.5799  | 12.6063   |
| 2.1965        | 21.0  | 210  | 1.9910          | 14.0795 | 5.1926 | 9.3653  | 13.3801   |
| 2.1586        | 22.0  | 220  | 1.9704          | 14.1261 | 5.9801 | 9.7882  | 13.503    |
| 2.1325        | 23.0  | 230  | 1.9513          | 14.3575 | 6.0074 | 9.6053  | 13.672    |
| 2.099         | 24.0  | 240  | 1.9332          | 15.6132 | 6.3777 | 10.3533 | 14.9225   |
| 2.0703        | 25.0  | 250  | 1.9141          | 16.145  | 6.8437 | 10.6729 | 15.0299   |
| 2.0438        | 26.0  | 260  | 1.8984          | 15.3881 | 6.5977 | 10.048  | 14.7873   |
| 2.0187        | 27.0  | 270  | 1.8846          | 14.1595 | 6.3778 | 9.4685  | 13.3986   |
| 1.9954        | 28.0  | 280  | 1.8693          | 14.2631 | 6.3966 | 10.4774 | 13.4271   |
| 1.9723        | 29.0  | 290  | 1.8576          | 15.878  | 6.6511 | 10.8733 | 14.6417   |
| 1.9465        | 30.0  | 300  | 1.8431          | 16.5193 | 8.3503 | 11.7271 | 15.6162   |
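
The per-epoch numbers above can be checked programmatically. The sketch below copies the Validation Loss and Rouge1 columns into plain lists (the variable names are illustrative) and tests whether the loss fell at every epoch and which epoch scored the best Rouge1. It only summarizes the logged values and makes no claim about how the checkpoint was selected.

```python
# Validation Loss and Rouge1 columns, epochs 1 through 30, copied from the table.
val_loss = [
    2.9094, 2.8898, 2.8578, 2.8156, 2.7651, 2.7088, 2.6486, 2.5859, 2.5224, 2.4585,
    2.3954, 2.3347, 2.2783, 2.2253, 2.1782, 2.1369, 2.1012, 2.0691, 2.0403, 2.0150,
    1.9910, 1.9704, 1.9513, 1.9332, 1.9141, 1.8984, 1.8846, 1.8693, 1.8576, 1.8431,
]
rouge1 = [
    8.8016, 9.2296, 9.4144, 9.2048, 7.4966, 8.8105, 9.3756, 9.5975, 9.5107, 9.8073,
    10.604, 10.3728, 11.0596, 12.4204, 12.8046, 13.0668, 13.7679, 13.2512, 14.3491, 13.9778,
    14.0795, 14.1261, 14.3575, 15.6132, 16.145, 15.3881, 14.1595, 14.2631, 15.878, 16.5193,
]

# True if validation loss decreased at every epoch.
loss_monotonic = all(prev > cur for prev, cur in zip(val_loss, val_loss[1:]))

# Epoch (1-indexed) with the highest Rouge1.
best_epoch = max(range(len(rouge1)), key=lambda i: rouge1[i]) + 1

# Number of epochs where Rouge1 fell relative to the previous epoch.
rouge1_drops = sum(cur < prev for prev, cur in zip(rouge1, rouge1[1:]))

print(f"loss strictly decreasing: {loss_monotonic}")
print(f"best Rouge1 at epoch {best_epoch}: {rouge1[best_epoch - 1]}")
print(f"epochs where Rouge1 dropped: {rouge1_drops}")
```

Validation loss is still falling at epoch 30 while Rouge1 fluctuates from epoch to epoch, which suggests, without proving it, that the run may not have converged within the 30 epochs.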

Framework versions

  • Transformers 4.41.2
  • PyTorch 2.1.2
  • Datasets 2.19.2
  • Tokenizers 0.19.1

Model size

  • Parameters: 162M (Safetensors)
  • Tensor type: F32

Model tree

silmi224/exp2-led-risalah_data_v4 is fine-tuned from silmi224/finetune-led-35000.