---
license: mit
base_model: PORTULAN/albertina-ptbr-base
tags:
  - generated_from_trainer
model-index:
  - name: e3_lr2e-05
    results: []
---

# e3_lr2e-05

This model is a fine-tuned version of [PORTULAN/albertina-ptbr-base](https://huggingface.co/PORTULAN/albertina-ptbr-base) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 0.9281
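The card does not state the training objective, so the snippet below is only a sketch of loading the checkpoint. The repo id `lailamt/e3_lr2e-05` and the masked-language-modeling head are assumptions, not confirmed by the card; adjust both to match the actual repository and task.

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Assumed repo id (user "lailamt") and an assumed masked-LM head --
# the card does not state the training objective.
model_id = "lailamt/e3_lr2e-05"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Example: fill a masked token in a Portuguese sentence.
inputs = tokenizer("A capital do Brasil é [MASK].", return_tensors="pt")
outputs = model(**inputs)

# Locate the [MASK] position and take the highest-scoring token there.
mask_index = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
predicted_id = outputs.logits[0, mask_index].argmax(-1)
print(tokenizer.decode([predicted_id.item()]))
```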

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch reproducing them with `transformers.TrainingArguments` follows the list):

- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 256
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
- mixed_precision_training: Native AMP
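A minimal sketch of the configuration above, assuming the standard `Trainer` setup; `output_dir` is a placeholder, and the dataset and model wiring are omitted because the card does not name the training data.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="e3_lr2e-05",         # placeholder, not from the card
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    seed=42,
    gradient_accumulation_steps=16,  # 16 * 16 = 256 effective batch size
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=3,
    fp16=True,                       # "Native AMP" mixed precision
)
```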

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.4061        | 0.1040 | 100  | 1.1920          |
| 1.2553        | 0.2080 | 200  | 1.1209          |
| 1.2102        | 0.3120 | 300  | 1.0971          |
| 1.1773        | 0.4160 | 400  | 1.0738          |
| 1.1432        | 0.5200 | 500  | 1.0481          |
| 1.1302        | 0.6240 | 600  | 1.0320          |
| 1.1153        | 0.7280 | 700  | 1.0243          |
| 1.1057        | 0.8320 | 800  | 1.0107          |
| 1.0976        | 0.9360 | 900  | 1.0002          |
| 1.0889        | 1.0400 | 1000 | 0.9907          |
| 1.0797        | 1.1440 | 1100 | 0.9836          |
| 1.0633        | 1.2480 | 1200 | 0.9788          |
| 1.0582        | 1.3521 | 1300 | 0.9761          |
| 1.0578        | 1.4561 | 1400 | 0.9635          |
| 1.0423        | 1.5601 | 1500 | 0.9601          |
| 1.0411        | 1.6641 | 1600 | 0.9578          |
| 1.0406        | 1.7681 | 1700 | 0.9527          |
| 1.0436        | 1.8721 | 1800 | 0.9520          |
| 1.0363        | 1.9761 | 1900 | 0.9443          |
| 1.0274        | 2.0801 | 2000 | 0.9419          |
| 1.03          | 2.1841 | 2100 | 0.9417          |
| 1.0232        | 2.2881 | 2200 | 0.9392          |
| 1.0237        | 2.3921 | 2300 | 0.9374          |
| 1.0199        | 2.4961 | 2400 | 0.9354          |
| 1.0095        | 2.6001 | 2500 | 0.9399          |
| 1.0145        | 2.7041 | 2600 | 0.9343          |
| 1.0179        | 2.8081 | 2700 | 0.9297          |
| 1.0148        | 2.9121 | 2800 | 0.9328          |

### Framework versions

- Transformers 4.41.2
- Pytorch 2.3.0+cu121
- Datasets 2.19.2
- Tokenizers 0.19.1
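For reproducibility, it may help to confirm the local environment matches the versions above before loading the model; a minimal check:

```python
# Compare installed library versions against those reported in this card.
import datasets, tokenizers, torch, transformers

expected = {
    "transformers": "4.41.2",
    "torch": "2.3.0+cu121",
    "datasets": "2.19.2",
    "tokenizers": "0.19.1",
}
actual = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    status = "ok" if actual[name] == want else f"mismatch (got {actual[name]})"
    print(f"{name}: expected {want} -> {status}")
```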