---
license: mit
base_model: FacebookAI/xlm-roberta-base
tags:
- generated_from_trainer
model-index:
- name: e3_lr2e-05
  results: []
---

# e3_lr2e-05

This model is a fine-tuned version of [FacebookAI/xlm-roberta-base](https://huggingface.co/FacebookAI/xlm-roberta-base) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 1.6436
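
The dataset and task head are not documented, so the training objective is unknown; assuming the run fine-tuned the masked-language-modeling head that ships with the base XLM-RoBERTa checkpoint, loading might look like the sketch below. The repository id `lailamt/e3_lr2e-05` is an assumption inferred from this page, not confirmed by the card.

```python
from transformers import pipeline

# Assumption: masked language modeling; the card does not state the task.
# "lailamt/e3_lr2e-05" is a hypothetical repo id inferred from the page.
fill = pipeline("fill-mask", model="lailamt/e3_lr2e-05")

# XLM-RoBERTa uses "<mask>" as its mask token.
print(fill("The capital of France is <mask>."))
```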

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
- mixed_precision_training: Native AMP
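
As a rough illustration (a sketch, not the author's actual training script; the dataset, model head, and data collator are unknown), these settings map onto `transformers.TrainingArguments` roughly as follows:

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters; output_dir is a placeholder.
# The Adam betas/epsilon listed above are the TrainingArguments defaults.
args = TrainingArguments(
    output_dir="e3_lr2e-05",
    learning_rate=2e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=16,  # 16 * 16 = 256 total train batch size
    num_train_epochs=3,
    lr_scheduler_type="linear",
    seed=42,
    fp16=True,  # "Native AMP" mixed precision
)
```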

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 2.9961        | 0.1404 | 100  | 1.9416          |
| 2.0472        | 0.2808 | 200  | 1.8589          |
| 1.9766        | 0.4212 | 300  | 1.8095          |
| 1.9319        | 0.5616 | 400  | 1.7736          |
| 1.897         | 0.7021 | 500  | 1.7447          |
| 1.8743        | 0.8425 | 600  | 1.7370          |
| 1.86          | 0.9829 | 700  | 1.7156          |
| 1.8431        | 1.1233 | 800  | 1.7071          |
| 1.8217        | 1.2637 | 900  | 1.6939          |
| 1.8212        | 1.4041 | 1000 | 1.6900          |
| 1.8053        | 1.5445 | 1100 | 1.6774          |
| 1.7899        | 1.6849 | 1200 | 1.6736          |
| 1.799         | 1.8254 | 1300 | 1.6644          |
| 1.7845        | 1.9658 | 1400 | 1.6559          |
| 1.7704        | 2.1062 | 1500 | 1.6531          |
| 1.776         | 2.2466 | 1600 | 1.6528          |
| 1.773         | 2.3870 | 1700 | 1.6417          |
| 1.7632        | 2.5274 | 1800 | 1.6452          |
| 1.7451        | 2.6678 | 1900 | 1.6460          |
| 1.7505        | 2.8088 | 2000 | 1.6455          |
| 1.7602        | 2.9492 | 2100 | 1.6399          |
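
If the validation loss is a per-token cross-entropy (an assumption, since the training objective is not documented), the final value corresponds to a perplexity of roughly exp(1.6436) ≈ 5.17:

```python
import math

# Perplexity from cross-entropy loss; assumes the reported loss is a
# per-token cross-entropy, which the card does not confirm.
print(math.exp(1.6436))  # ~5.17
```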

### Framework versions

- Transformers 4.41.2
- Pytorch 2.3.0+cu121
- Datasets 2.19.2
- Tokenizers 0.19.1
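
A quick way to confirm a local environment matches the versions above (a sketch; exact version pins may not be needed for inference):

```python
import datasets
import tokenizers
import torch
import transformers

# Versions reported for the training run, per the list above.
print(transformers.__version__)  # 4.41.2
print(torch.__version__)         # 2.3.0+cu121
print(datasets.__version__)      # 2.19.2
print(tokenizers.__version__)    # 0.19.1
```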