e3_lr2e-05 / README.md

lailamt

Model save

e6d7d7d verified 4 months ago


	---
	license: mit
	base_model: neuralmind/bert-base-portuguese-cased
	tags:
	- generated_from_trainer
	model-index:
	- name: e3_lr2e-05
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# e3_lr2e-05

	This model is a fine-tuned version of [neuralmind/bert-base-portuguese-cased](https://huggingface.co/neuralmind/bert-base-portuguese-cased) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 1.5721

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 8
	- eval_batch_size: 16
	- seed: 42
	- gradient_accumulation_steps: 16
	- total_train_batch_size: 128
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 3
	- mixed_precision_training: Native AMP
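The batch-size and schedule entries above can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes a single training device (the card lists no device count), assumes no warmup (none is listed), and estimates the total number of optimizer steps from the first row of the results table (step 100 at epoch 0.0707). The helper `linear_lr` is a hypothetical name, not a library function.

```python
# Values copied from the hyperparameter list above.
learning_rate = 2e-05
train_batch_size = 8
gradient_accumulation_steps = 16
num_epochs = 3

# Assumption: one device, since the card does not list a device count.
num_devices = 1

# Effective batch size per optimizer step.
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Assumption: steps per epoch inferred from the results table
# (step 100 is logged at epoch 0.0707), so this is only an estimate.
steps_per_epoch = 100 / 0.0707
total_steps = round(steps_per_epoch * num_epochs)


def linear_lr(step, total=total_steps, base_lr=learning_rate, warmup_steps=0):
    """Linear warmup followed by linear decay to zero (hypothetical helper)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total - step)
    return base_lr * remaining / max(1, total - warmup_steps)


print("effective batch size:", effective_batch_size)
print("estimated total steps:", total_steps)
print("lr at start, midpoint, end:",
      linear_lr(0), linear_lr(total_steps // 2), linear_lr(total_steps))
```

Under these assumptions the effective batch size matches the reported `total_train_batch_size` of 128, and the learning rate falls linearly from 2e-05 to zero over the run.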

	### Training results

	| Training Loss | Epoch  | Step | Validation Loss |
	|:-------------:|:------:|:----:|:---------------:|
	| 2.2771        | 0.0707 | 100  | 1.9875          |
	| 2.0486        | 0.1414 | 200  | 1.8946          |
	| 1.993         | 0.2121 | 300  | 1.8415          |
	| 1.9532        | 0.2828 | 400  | 1.8133          |
	| 1.9145        | 0.3535 | 500  | 1.7807          |
	| 1.8872        | 0.4242 | 600  | 1.7534          |
	| 1.8593        | 0.4949 | 700  | 1.7357          |
	| 1.8447        | 0.5656 | 800  | 1.7173          |
	| 1.8149        | 0.6363 | 900  | 1.7074          |
	| 1.7966        | 0.7070 | 1000 | 1.7036          |
	| 1.8034        | 0.7777 | 1100 | 1.6883          |
	| 1.7854        | 0.8484 | 1200 | 1.6740          |
	| 1.7779        | 0.9191 | 1300 | 1.6642          |
	| 1.7706        | 0.9897 | 1400 | 1.6582          |
	| 1.7723        | 1.0604 | 1500 | 1.6475          |
	| 1.746         | 1.1311 | 1600 | 1.6463          |
	| 1.7386        | 1.2018 | 1700 | 1.6399          |
	| 1.7319        | 1.2725 | 1800 | 1.6385          |
	| 1.7292        | 1.3432 | 1900 | 1.6230          |
	| 1.7121        | 1.4139 | 2000 | 1.6204          |
	| 1.7245        | 1.4846 | 2100 | 1.6152          |
	| 1.7159        | 1.5553 | 2200 | 1.6103          |
	| 1.7232        | 1.6260 | 2300 | 1.6114          |
	| 1.6952        | 1.6967 | 2400 | 1.6099          |
	| 1.6944        | 1.7674 | 2500 | 1.6012          |
	| 1.6991        | 1.8381 | 2600 | 1.5970          |
	| 1.6954        | 1.9088 | 2700 | 1.5933          |
	| 1.698         | 1.9795 | 2800 | 1.5918          |
	| 1.6857        | 2.0502 | 2900 | 1.5915          |
	| 1.6783        | 2.1209 | 3000 | 1.5840          |
	| 1.679         | 2.1916 | 3100 | 1.5817          |
	| 1.6796        | 2.2623 | 3200 | 1.5835          |
	| 1.6709        | 2.3330 | 3300 | 1.5769          |
	| 1.6626        | 2.4037 | 3400 | 1.5819          |
	| 1.6732        | 2.4744 | 3500 | 1.5824          |
	| 1.6726        | 2.5458 | 3600 | 1.5720          |
	| 1.6822        | 2.6165 | 3700 | 1.5758          |
	| 1.6578        | 2.6872 | 3800 | 1.5739          |
	| 1.6756        | 2.7579 | 3900 | 1.5743          |
	| 1.6747        | 2.8286 | 4000 | 1.5695          |
	| 1.659         | 2.8993 | 4100 | 1.5713          |
	| 1.6587        | 2.9700 | 4200 | 1.5750          |
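The validation losses above can be read more intuitively as perplexities, and the table also shows which logged step scored best. The sketch below is a rough illustration, not part of the original training code. It assumes the reported loss is a mean token-level cross-entropy in nats, which the card does not state, and it copies only the last few table rows, where the validation loss has largely plateaued.

```python
import math

# (step, validation loss) pairs copied from the final rows of the table above.
rows = [
    (3600, 1.5720),
    (3700, 1.5758),
    (3800, 1.5739),
    (3900, 1.5743),
    (4000, 1.5695),
    (4100, 1.5713),
    (4200, 1.5750),
]

# Logged step with the lowest validation loss among these rows.
best_step, best_loss = min(rows, key=lambda row: row[1])

# Assumption: the loss is a mean cross-entropy in nats, so perplexity = exp(loss).
best_perplexity = math.exp(best_loss)

# Loss reported in the card summary for the final evaluation.
final_loss = 1.5721
final_perplexity = math.exp(final_loss)

print(f"best logged step: {best_step} (validation loss {best_loss})")
print(f"perplexity at best logged step: {best_perplexity:.3f}")
print(f"perplexity at final evaluation: {final_perplexity:.3f}")
```

If the loss is averaged differently (for example per sequence rather than per token), these perplexity figures would not apply.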


	### Framework versions

	- Transformers 4.41.2
	- PyTorch 2.3.0+cu121
	- Datasets 2.19.2
	- Tokenizers 0.19.1
