harouzie committed
Commit
7a2fcd0
1 Parent(s): 74525d2

End of training

Files changed (2)
  1. README.md +8 -17
  2. pytorch_model.bin +1 -1
README.md CHANGED
@@ -4,18 +4,9 @@ tags:
 - generated_from_trainer
 metrics:
 - bleu
-- sacrebleu
 model-index:
 - name: mbart-translation-en2vi
   results: []
-license: mit
-datasets:
-- harouzie/vi_en-translation
-language:
-- vi
-- en
-library_name: transformers
-pipeline_tag: translation
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -25,9 +16,9 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [facebook/mbart-large-50-many-to-many-mmt](https://huggingface.co/facebook/mbart-large-50-many-to-many-mmt) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.6752
-- Bleu: 55.2256
-- Gen Len: 11.9724
+- Loss: 0.4754
+- Bleu: 65.4567
+- Gen Len: 12.1165
 
 ## Model description
 
@@ -47,11 +38,11 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
-- train_batch_size: 4
-- eval_batch_size: 4
+- train_batch_size: 8
+- eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 4
-- total_train_batch_size: 16
+- total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 1
@@ -60,7 +51,7 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
 |:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|
-| No log | 1.0 | 127 | 0.6752 | 55.2256 | 11.9724 |
+| 0.6354 | 1.0 | 635 | 0.4754 | 65.4567 | 12.1165 |
 
 
 ### Framework versions
@@ -68,4 +59,4 @@ The following hyperparameters were used during training:
 - Transformers 4.33.0
 - Pytorch 2.0.0
 - Datasets 2.1.0
-- Tokenizers 0.13.3
+- Tokenizers 0.13.3
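The hyperparameter change doubles the per-device batch size while keeping gradient accumulation at 4, which is why `total_train_batch_size` goes from 16 to 32. A minimal sketch of that arithmetic (the helper function is illustrative, not from the training script; the dataset-size estimate in the comment is a rough inference from the table, not stated in the card):

```python
def effective_batch_size(train_batch_size: int, gradient_accumulation_steps: int) -> int:
    """Number of examples contributing to each optimizer update:
    per-device batch size times the number of accumulated gradient steps."""
    return train_batch_size * gradient_accumulation_steps

# Previous revision: 4 * 4 = 16; this commit: 8 * 4 = 32.
old_total = effective_batch_size(4, 4)
new_total = effective_batch_size(8, 4)
print(old_total, new_total)  # 16 32

# With 635 optimizer steps in one epoch at an effective batch of 32, the
# training split would be roughly 635 * 32 = 20,320 examples (an inference
# from the results table, assuming no dropped last batch).
```

The BLEU improvement (55.2 to 65.5) in the same commit suggests the larger effective batch and/or longer run (635 vs 127 steps) came with substantially more training data, though the card itself does not say.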
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:23dd708c237decd70d17c07a2af66a7feeda6c32897dd1d069c56fb466935929
+oid sha256:4b667d48d88327a3319db380af7ef7dc5472f39eb12d96881f9459b3989baa89
 size 2444694045
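The `pytorch_model.bin` entry in this diff is a Git LFS pointer file, not the weights themselves: only the `oid` (SHA-256 of the tracked file) changes, while the byte size happens to stay identical at ~2.44 GB. A minimal sketch of parsing such a pointer (the `parse_lfs_pointer` helper is hypothetical; the pointer format itself is the standard `key value` lines shown above):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its key/value fields.
    Each line is 'key value', e.g. 'oid sha256:<hex digest>'."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# The new pointer from this commit.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:4b667d48d88327a3319db380af7ef7dc5472f39eb12d96881f9459b3989baa89
size 2444694045"""

info = parse_lfs_pointer(pointer)
print(info["oid"])   # sha256:4b667d4...
print(info["size"])  # 2444694045 bytes, about 2.4 GB
```

An unchanged `size` with a changed `oid` is exactly what a retrained checkpoint of the same architecture looks like in an LFS-backed repo: same tensor shapes, different weight values.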