# distilroberta-base-reuters-bloomberg
This model is a fine-tuned version of [distilroberta-base](https://huggingface.co/distilroberta-base) on an unknown dataset (the model name suggests Reuters and Bloomberg news text, but the dataset is not documented). It achieves the following results on the evaluation set:
- Loss: 1.2863
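Since distilroberta-base is a masked language model, the checkpoint can be queried with the `fill-mask` pipeline. A minimal sketch, assuming the checkpoint is published under this repository name (substitute the actual `<namespace>/...` model id):

```python
from transformers import pipeline

# Hypothetical repository id; replace with the real
# "<namespace>/distilroberta-base-reuters-bloomberg" path on the Hub.
fill_mask = pipeline("fill-mask", model="distilroberta-base-reuters-bloomberg")

# RoBERTa-style tokenizers use "<mask>" as the mask token.
print(fill_mask("The central bank raised interest <mask> by 25 basis points."))
```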
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training; a sketch mapping them onto `transformers.TrainingArguments` follows the list:
- learning_rate: 7.2115e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.12
- num_epochs: 30
- mixed_precision_training: Native AMP
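The listed settings correspond roughly to the `TrainingArguments` below. This is a reconstruction, not the author's training script: `output_dir` is a placeholder, the Adam betas/epsilon shown are the library defaults, and the per-epoch evaluation is inferred from the results table.

```python
from transformers import TrainingArguments

# Reconstructed from the hyperparameter list above; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="distilroberta-base-reuters-bloomberg",
    learning_rate=7.2115e-05,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    adam_beta1=0.9,              # library default
    adam_beta2=0.999,            # library default
    adam_epsilon=1e-08,          # library default
    lr_scheduler_type="linear",
    warmup_ratio=0.12,
    num_train_epochs=30,
    fp16=True,                   # "Native AMP" mixed precision
    evaluation_strategy="epoch", # one eval per epoch, matching the results table
)
```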
### Training results
| Training Loss | Epoch | Step   | Validation Loss |
|:-------------:|:-----:|:------:|:---------------:|
| 2.0326        | 1.0   | 6953   | 1.8326          |
| 1.8515        | 2.0   | 13906  | 1.6887          |
| 1.7835        | 3.0   | 20859  | 1.6311          |
| 1.7325        | 4.0   | 27812  | 1.5970          |
| 1.6869        | 5.0   | 34765  | 1.5563          |
| 1.6469        | 6.0   | 41718  | 1.5195          |
| 1.6057        | 7.0   | 48671  | 1.4967          |
| 1.581         | 8.0   | 55624  | 1.4782          |
| 1.5587        | 9.0   | 62577  | 1.4598          |
| 1.54          | 10.0  | 69530  | 1.4417          |
| 1.5175        | 11.0  | 76483  | 1.4288          |
| 1.5001        | 12.0  | 83436  | 1.4214          |
| 1.4848        | 13.0  | 90389  | 1.4043          |
| 1.4779        | 14.0  | 97342  | 1.3957          |
| 1.4625        | 15.0  | 104295 | 1.3844          |
| 1.4448        | 16.0  | 111248 | 1.3753          |
| 1.4303        | 17.0  | 118201 | 1.3674          |
| 1.4266        | 18.0  | 125154 | 1.3587          |
| 1.4116        | 19.0  | 132107 | 1.3485          |
| 1.4042        | 20.0  | 139060 | 1.3388          |
| 1.3943        | 21.0  | 146013 | 1.3326          |
| 1.3838        | 22.0  | 152966 | 1.3262          |
| 1.3725        | 23.0  | 159919 | 1.3213          |
| 1.3545        | 24.0  | 166872 | 1.3127          |
| 1.3584        | 25.0  | 173825 | 1.3078          |
| 1.345         | 26.0  | 180778 | 1.3052          |
| 1.3382        | 27.0  | 187731 | 1.2982          |
| 1.3349        | 28.0  | 194684 | 1.2906          |
| 1.3315        | 29.0  | 201637 | 1.2865          |
| 1.3232        | 30.0  | 208590 | 1.2864          |
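For a masked-LM objective, the cross-entropy validation loss converts to perplexity via `exp(loss)`; the reported final loss of 1.2863 corresponds to a perplexity of roughly 3.62:

```python
import math

# Perplexity is the exponential of the cross-entropy loss.
eval_loss = 1.2863
print(f"perplexity ≈ {math.exp(eval_loss):.2f}")  # ≈ 3.62
```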
### Framework versions
- Transformers 4.35.0
- Pytorch 2.1.0+cu118
- Datasets 2.14.6
- Tokenizers 0.14.1