End of training

Files changed:
- README.md +8 -17
- pytorch_model.bin +1 -1
README.md CHANGED

```diff
@@ -4,18 +4,9 @@ tags:
 - generated_from_trainer
 metrics:
 - bleu
-- sacrebleu
 model-index:
 - name: mbart-translation-en2vi
   results: []
-license: mit
-datasets:
-- harouzie/vi_en-translation
-language:
-- vi
-- en
-library_name: transformers
-pipeline_tag: translation
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -25,9 +16,9 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [facebook/mbart-large-50-many-to-many-mmt](https://huggingface.co/facebook/mbart-large-50-many-to-many-mmt) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
-- Bleu:
-- Gen Len:
+- Loss: 0.4754
+- Bleu: 65.4567
+- Gen Len: 12.1165
 
 ## Model description
 
@@ -47,11 +38,11 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
-- train_batch_size:
-- eval_batch_size:
+- train_batch_size: 8
+- eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 4
-- total_train_batch_size:
+- total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 1
@@ -60,7 +51,7 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
 |:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|
-
+| 0.6354 | 1.0 | 635 | 0.4754 | 65.4567 | 12.1165 |
 
 
 ### Framework versions
@@ -68,4 +59,4 @@ The following hyperparameters were used during training:
 - Transformers 4.33.0
 - Pytorch 2.0.0
 - Datasets 2.1.0
-- Tokenizers 0.13.3
+- Tokenizers 0.13.3
```
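The hyperparameter block pins down the effective training configuration: with a per-device batch of 8 and 4 gradient-accumulation steps, the total train batch size is 8 × 4 = 32, and at 635 optimizer steps for one epoch that implies roughly 20k training pairs. As a hedged sketch, here is how those recorded values would map onto `Seq2SeqTrainingArguments` in Transformers 4.33; the `output_dir` and `evaluation_strategy` values are assumptions, not recorded in the card.

```python
# Sketch only: the recorded hyperparameters expressed as
# Seq2SeqTrainingArguments (Transformers 4.33). Model/data wiring and
# the values marked "assumption" are not part of this commit.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mbart-translation-en2vi",  # assumption: run name taken from the card
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,   # total train batch size: 8 * 4 = 32
    num_train_epochs=1,
    lr_scheduler_type="linear",
    seed=42,
    predict_with_generate=True,      # needed to produce the BLEU / Gen Len columns
    evaluation_strategy="epoch",     # assumption: the results table shows one row per epoch
)
```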
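Note that this commit drops `pipeline_tag: translation`, the `language` tags, and the dataset link from the front matter, so the intended usage is easy to lose. For reference, a minimal English→Vietnamese inference sketch for an mBART-50 checkpoint like this one; the path `"mbart-translation-en2vi"` is a placeholder for the local clone or Hub id of this repository.

```python
# Sketch: English -> Vietnamese inference with an mBART-50 checkpoint.
# "mbart-translation-en2vi" is a placeholder path; point it at this repo.
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

model = MBartForConditionalGeneration.from_pretrained("mbart-translation-en2vi")
tokenizer = MBart50TokenizerFast.from_pretrained("mbart-translation-en2vi")

tokenizer.src_lang = "en_XX"  # mBART-50 language code for English
inputs = tokenizer("How are you today?", return_tensors="pt")
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["vi_VN"],  # decode into Vietnamese
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```

The `forced_bos_token_id` is what steers the many-to-many mBART-50 decoder toward Vietnamese; without it the model may decode back into the source language.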
pytorch_model.bin CHANGED

```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:4b667d48d88327a3319db380af7ef7dc5472f39eb12d96881f9459b3989baa89
 size 2444694045
```
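The weights themselves live in Git LFS, so the commit only rewrites the three-line pointer file: the spec version, the blob's SHA-256 (`oid`), and its byte size, which is unchanged at 2,444,694,045 bytes (about 2.4 GB). A small stdlib-only sketch for checking that a downloaded `pytorch_model.bin` matches the new pointer:

```python
# Sketch: verify a downloaded pytorch_model.bin against the LFS pointer above.
import hashlib
from pathlib import Path

EXPECTED_OID = "4b667d48d88327a3319db380af7ef7dc5472f39eb12d96881f9459b3989baa89"
EXPECTED_SIZE = 2444694045

path = Path("pytorch_model.bin")
assert path.stat().st_size == EXPECTED_SIZE, "size mismatch with LFS pointer"

sha = hashlib.sha256()
with path.open("rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # hash in 1 MiB chunks
        sha.update(chunk)

assert sha.hexdigest() == EXPECTED_OID, "sha256 mismatch with LFS pointer"
print("pytorch_model.bin matches the LFS pointer")
```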