distilroberta-rbm213k-ep40 / README.md

judy93536

	---
	license: apache-2.0
	base_model: distilroberta-base
	tags:
	- generated_from_trainer
	model-index:
	- name: distilroberta-base-rbm213k-ep40
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# distilroberta-base-rbm213k-ep40

	This model is a fine-tuned version of [distilroberta-base](https://huggingface.co/distilroberta-base) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 1.2136
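For readers who prefer perplexity, the reported loss can be converted with a one-line helper. This is a minimal sketch, assuming the loss is the mean token-level cross-entropy in nats (the usual Trainer output for a language-modeling objective); the card itself does not state the task, and `perplexity` is a hypothetical helper name.

```python
import math

# Assumption: the reported evaluation loss is a mean cross-entropy in nats.
EVAL_LOSS = 1.2136  # value reported in this card


def perplexity(cross_entropy_nats: float) -> float:
    """Convert a mean cross-entropy (in nats) to perplexity."""
    return math.exp(cross_entropy_nats)


if __name__ == "__main__":
    print(f"perplexity = {perplexity(EVAL_LOSS):.2f}")
```

Under that assumption, the evaluation loss corresponds to a perplexity of about 3.4.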

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 7.3e-05
	- train_batch_size: 64
	- eval_batch_size: 64
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_ratio: 0.19
	- num_epochs: 40
	- mixed_precision_training: Native AMP
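The learning-rate settings above can be turned into a concrete schedule. The sketch below reimplements linear warmup followed by linear decay in plain Python, assuming the warmup length is `ceil(total_steps * warmup_ratio)` and that the total step count is 543480, the final value in the Step column of the training results. The function names are hypothetical and this is not the Trainer's own code.

```python
import math

PEAK_LR = 7.3e-05      # learning_rate from the list above
WARMUP_RATIO = 0.19    # lr_scheduler_warmup_ratio from the list above
TOTAL_STEPS = 543480   # assumption: last value in the Step column of the results table


def warmup_steps(total_steps: int = TOTAL_STEPS, ratio: float = WARMUP_RATIO) -> int:
    """Number of warmup steps, assuming the ratio is rounded up."""
    return math.ceil(total_steps * ratio)


def linear_lr(step: int, peak_lr: float = PEAK_LR, total_steps: int = TOTAL_STEPS,
              ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at `step`: linear ramp to peak_lr, then linear decay to 0."""
    n_warmup = warmup_steps(total_steps, ratio)
    if step < n_warmup:
        return peak_lr * step / max(1, n_warmup)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - n_warmup)


if __name__ == "__main__":
    for s in (0, warmup_steps(), TOTAL_STEPS // 2, TOTAL_STEPS):
        print(s, linear_lr(s))
```

With a warmup ratio of 0.19 over 40 epochs, the learning rate would peak partway through epoch 8 under these assumptions.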

	### Training results

	| Training Loss | Epoch | Step | Validation Loss |
	|:-------------:|:-----:|:------:|:---------------:|
	| 1.7375 | 1.0 | 13587 | 1.5878 |
	| 1.6433 | 2.0 | 27174 | 1.5049 |
	| 1.5936 | 3.0 | 40761 | 1.4690 |
	| 1.5591 | 4.0 | 54348 | 1.4469 |
	| 1.5452 | 5.0 | 67935 | 1.4324 |
	| 1.5378 | 6.0 | 81522 | 1.4267 |
	| 1.5333 | 7.0 | 95109 | 1.4279 |
	| 1.5248 | 8.0 | 108696 | 1.4250 |
	| 1.5095 | 9.0 | 122283 | 1.4120 |
	| 1.4906 | 10.0 | 135870 | 1.3937 |
	| 1.4746 | 11.0 | 149457 | 1.3860 |
	| 1.4565 | 12.0 | 163044 | 1.3730 |
	| 1.4377 | 13.0 | 176631 | 1.3672 |
	| 1.4222 | 14.0 | 190218 | 1.3581 |
	| 1.415 | 15.0 | 203805 | 1.3501 |
	| 1.4148 | 16.0 | 217392 | 1.3422 |
	| 1.404 | 17.0 | 230979 | 1.3356 |
	| 1.3925 | 18.0 | 244566 | 1.3296 |
	| 1.3782 | 19.0 | 258153 | 1.3207 |
	| 1.3655 | 20.0 | 271740 | 1.3185 |
	| 1.3628 | 21.0 | 285327 | 1.3109 |
	| 1.355 | 22.0 | 298914 | 1.3046 |
	| 1.3455 | 23.0 | 312501 | 1.2985 |
	| 1.3365 | 24.0 | 326088 | 1.2900 |
	| 1.3279 | 25.0 | 339675 | 1.2858 |
	| 1.321 | 26.0 | 353262 | 1.2811 |
	| 1.3198 | 27.0 | 366849 | 1.2746 |
	| 1.3042 | 28.0 | 380436 | 1.2682 |
	| 1.3057 | 29.0 | 394023 | 1.2641 |
	| 1.2885 | 30.0 | 407610 | 1.2584 |
	| 1.2841 | 31.0 | 421197 | 1.2507 |
	| 1.2776 | 32.0 | 434784 | 1.2443 |
	| 1.2688 | 33.0 | 448371 | 1.2415 |
	| 1.2658 | 34.0 | 461958 | 1.2344 |
	| 1.2606 | 35.0 | 475545 | 1.2315 |
	| 1.2445 | 36.0 | 489132 | 1.2258 |
	| 1.2409 | 37.0 | 502719 | 1.2211 |
	| 1.2478 | 38.0 | 516306 | 1.2187 |
	| 1.236 | 39.0 | 529893 | 1.2138 |
	| 1.2277 | 40.0 | 543480 | 1.2124 |
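The Step column advances by 13587 per epoch, which allows a rough estimate of the training-set size. The sketch below assumes a single device, no gradient accumulation, and a data loader that keeps a partial final batch; none of these are stated in the card, so the bounds are illustrative only. `step_at_epoch` and `example_bounds` are hypothetical helper names.

```python
STEPS_PER_EPOCH = 13587   # step count after epoch 1 in the results table
TRAIN_BATCH_SIZE = 64     # train_batch_size from the hyperparameters
NUM_EPOCHS = 40           # num_epochs from the hyperparameters


def step_at_epoch(epoch: int, steps_per_epoch: int = STEPS_PER_EPOCH) -> int:
    """Cumulative optimizer steps at the end of `epoch`."""
    return epoch * steps_per_epoch


def example_bounds(steps_per_epoch: int = STEPS_PER_EPOCH,
                   batch_size: int = TRAIN_BATCH_SIZE) -> tuple[int, int]:
    """(min, max) training examples per epoch.

    Assumption: one device, no gradient accumulation, and the last batch
    may hold anywhere from 1 to `batch_size` examples.
    """
    lower = (steps_per_epoch - 1) * batch_size + 1
    upper = steps_per_epoch * batch_size
    return lower, upper


if __name__ == "__main__":
    print("final step:", step_at_epoch(NUM_EPOCHS))
    print("examples per epoch (min, max):", example_bounds())
```

Under these assumptions each epoch covers somewhat under 870,000 training examples, and the final step count matches the last row of the table.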


	### Framework versions

	- Transformers 4.35.2
	- Pytorch 2.1.0+cu118
	- Datasets 2.15.0
	- Tokenizers 0.15.0
