---
license: mit
base_model: neuralmind/bert-base-portuguese-cased
tags:
- generated_from_trainer
model-index:
- name: e3_lr2e-05
  results: []
---

# e3_lr2e-05

This model is a fine-tuned version of [neuralmind/bert-base-portuguese-cased](https://huggingface.co/neuralmind/bert-base-portuguese-cased) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 1.5721
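
Only a loss is reported, so the training objective is not stated on this card; a masked-language-modeling head is the most common setup when fine-tuning BERTimbau this way, and the sketch below assumes it. The repo id `lailamt/e3_lr2e-05` is likewise inferred from this card's location and may differ:

```python
from transformers import pipeline

# Assumed repo id and task; the card does not state either explicitly.
fill_mask = pipeline("fill-mask", model="lailamt/e3_lr2e-05")

# BERTimbau's tokenizer uses the standard [MASK] token.
for pred in fill_mask("Brasília é a capital do [MASK].")[:3]:
    print(pred["token_str"], round(pred["score"], 3))
```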

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch reproducing them follows the list):

- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
- mixed_precision_training: Native AMP
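
The list above maps directly onto `transformers.TrainingArguments`. A minimal sketch for reproduction; `output_dir` is a hypothetical choice, and `fp16=True` is how the card's "Native AMP" setting is usually expressed (the total batch of 128 falls out of 8 per device × 16 accumulation steps on a single device):

```python
from transformers import TrainingArguments

# Reconstruction of the hyperparameters reported on this card.
# output_dir is hypothetical; everything else is taken from the list above.
training_args = TrainingArguments(
    output_dir="e3_lr2e-05",
    learning_rate=2e-5,
    per_device_train_batch_size=8,    # x16 accumulation = total batch 128
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=16,
    num_train_epochs=3,
    lr_scheduler_type="linear",
    seed=42,
    fp16=True,                        # "Native AMP" mixed precision
)
```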

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 2.2771        | 0.0707 | 100  | 1.9875          |
| 2.0486        | 0.1414 | 200  | 1.8946          |
| 1.993         | 0.2121 | 300  | 1.8415          |
| 1.9532        | 0.2828 | 400  | 1.8133          |
| 1.9145        | 0.3535 | 500  | 1.7807          |
| 1.8872        | 0.4242 | 600  | 1.7534          |
| 1.8593        | 0.4949 | 700  | 1.7357          |
| 1.8447        | 0.5656 | 800  | 1.7173          |
| 1.8149        | 0.6363 | 900  | 1.7074          |
| 1.7966        | 0.7070 | 1000 | 1.7036          |
| 1.8034        | 0.7777 | 1100 | 1.6883          |
| 1.7854        | 0.8484 | 1200 | 1.6740          |
| 1.7779        | 0.9191 | 1300 | 1.6642          |
| 1.7706        | 0.9897 | 1400 | 1.6582          |
| 1.7723        | 1.0604 | 1500 | 1.6475          |
| 1.746         | 1.1311 | 1600 | 1.6463          |
| 1.7386        | 1.2018 | 1700 | 1.6399          |
| 1.7319        | 1.2725 | 1800 | 1.6385          |
| 1.7292        | 1.3432 | 1900 | 1.6230          |
| 1.7121        | 1.4139 | 2000 | 1.6204          |
| 1.7245        | 1.4846 | 2100 | 1.6152          |
| 1.7159        | 1.5553 | 2200 | 1.6103          |
| 1.7232        | 1.6260 | 2300 | 1.6114          |
| 1.6952        | 1.6967 | 2400 | 1.6099          |
| 1.6944        | 1.7674 | 2500 | 1.6012          |
| 1.6991        | 1.8381 | 2600 | 1.5970          |
| 1.6954        | 1.9088 | 2700 | 1.5933          |
| 1.698         | 1.9795 | 2800 | 1.5918          |
| 1.6857        | 2.0502 | 2900 | 1.5915          |
| 1.6783        | 2.1209 | 3000 | 1.5840          |
| 1.679         | 2.1916 | 3100 | 1.5817          |
| 1.6796        | 2.2623 | 3200 | 1.5835          |
| 1.6709        | 2.3330 | 3300 | 1.5769          |
| 1.6626        | 2.4037 | 3400 | 1.5819          |
| 1.6732        | 2.4744 | 3500 | 1.5824          |
| 1.6726        | 2.5458 | 3600 | 1.5720          |
| 1.6822        | 2.6165 | 3700 | 1.5758          |
| 1.6578        | 2.6872 | 3800 | 1.5739          |
| 1.6756        | 2.7579 | 3900 | 1.5743          |
| 1.6747        | 2.8286 | 4000 | 1.5695          |
| 1.659         | 2.8993 | 4100 | 1.5713          |
| 1.6587        | 2.9700 | 4200 | 1.5750          |

### Framework versions

- Transformers 4.41.2
- Pytorch 2.3.0+cu121
- Datasets 2.19.2
- Tokenizers 0.19.1