Commit 14ed422, committed by yasmineee
1 Parent(s): 880b4df

finetune-NLLB-600M-on-opus100-Ar2En-with-Dora

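This commit updates a DoRA fine-tune of NLLB-200-distilled-600M for Arabic-to-English translation on OPUS-100. For orientation, an adapter like the ~5 MB `adapter_model.safetensors` in this repo is typically created with PEFT's `LoraConfig(use_dora=True)`. The sketch below is illustrative, not the author's training script; the rank, alpha, dropout, and target modules are assumptions the commit does not record:

```python
# Minimal sketch: attaching a DoRA adapter to NLLB-200-distilled-600M with PEFT.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_id = "facebook/nllb-200-distilled-600M"
tokenizer = AutoTokenizer.from_pretrained(base_id, src_lang="arb_Arab", tgt_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(base_id)

dora_config = LoraConfig(
    use_dora=True,                        # DoRA = weight-decomposed LoRA (PEFT >= 0.9)
    r=16,                                 # assumed rank
    lora_alpha=32,                        # assumed scaling factor
    lora_dropout=0.05,                    # assumed dropout
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
    task_type="SEQ_2_SEQ_LM",
)
model = get_peft_model(model, dora_config)
model.print_trainable_parameters()  # only the small adapter trains; the base stays frozen
```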
README.md CHANGED
@@ -15,15 +15,14 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/FinalProject_/NLLB_2/runs/wpd875tt)
 # NLLB_DoRA
 
 This model is a fine-tuned version of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.3271
-- Bleu: 32.6656
-- Rouge: 0.593
-- Gen Len: 17.403
+- Loss: 1.2708
+- Bleu: 32.802
+- Rouge: 0.6028
+- Gen Len: 17.4444
 
 ## Model description
 
@@ -43,11 +42,11 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 2e-05
-- train_batch_size: 2
-- eval_batch_size: 2
+- train_batch_size: 1
+- eval_batch_size: 1
 - seed: 42
 - gradient_accumulation_steps: 4
-- total_train_batch_size: 8
+- total_train_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 3
@@ -56,15 +55,15 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Bleu    | Rouge  | Gen Len |
 |:-------------:|:-----:|:----:|:---------------:|:-------:|:------:|:-------:|
-| 2.722         | 1.0   | 875  | 1.3916          | 31.7382 | 0.5849 | 17.493  |
-| 1.4579        | 2.0   | 1750 | 1.3379          | 32.34   | 0.5931 | 17.3715 |
-| 1.4263        | 3.0   | 2625 | 1.3271          | 32.6656 | 0.593  | 17.403  |
+| 1.3937        | 1.0   | 2000 | 1.3115          | 32.2196 | 0.5954 | 17.6569 |
+| 1.3309        | 2.0   | 4000 | 1.2781          | 32.6752 | 0.6011 | 17.4931 |
+| 1.3234        | 3.0   | 6000 | 1.2708          | 32.802  | 0.6028 | 17.4444 |
 
 
 ### Framework versions
 
 - PEFT 0.12.0
-- Transformers 4.42.3
-- Pytorch 2.1.2
-- Datasets 2.20.0
+- Transformers 4.44.0
+- Pytorch 2.4.0
+- Datasets 2.21.0
 - Tokenizers 0.19.1
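The updated hyperparameters map onto `Seq2SeqTrainingArguments` roughly as in the sketch below. This is a hedged reconstruction: the output directory, eval strategy, and `predict_with_generate` are assumptions the card does not state.

```python
# Sketch of training arguments matching the card's new values
# (lr 2e-05, per-device batch size 1, gradient accumulation 4,
# linear schedule, 3 epochs, seed 42).
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="NLLB_DoRA",          # assumed output directory
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,   # effective (total) train batch size: 1 x 4 = 4
    num_train_epochs=3,
    lr_scheduler_type="linear",
    seed=42,
    eval_strategy="epoch",           # one eval per epoch, matching the results table
    predict_with_generate=True,      # needed to compute BLEU/ROUGE/Gen Len at eval
)
```

Note the arithmetic behind the step counts: the old run processed 2 × 4 = 8 examples per optimizer step over 875 steps per epoch (7,000 examples), while the new run processes 1 × 4 = 4 over 2,000 steps per epoch (8,000 examples), so the training split apparently grew in addition to the effective batch shrinking.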
 
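For reference, the Bleu / Rouge / Gen Len columns in the table above are typically produced by a `compute_metrics` hook passed to `Seq2SeqTrainer`. The sketch below assumes `sacrebleu` and `rougeL` via the `evaluate` library; the card does not say which implementations were actually used.

```python
# Sketch of a compute_metrics hook for Seq2SeqTrainer with predict_with_generate=True.
import numpy as np
import evaluate
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/nllb-200-distilled-600M")
bleu = evaluate.load("sacrebleu")
rouge = evaluate.load("rouge")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # The Trainer pads with -100; replace it before decoding.
    preds = np.where(preds != -100, preds, tokenizer.pad_token_id)
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    gen_len = np.mean([np.count_nonzero(p != tokenizer.pad_token_id) for p in preds])
    return {
        "bleu": bleu.compute(predictions=decoded_preds,
                             references=[[l] for l in decoded_labels])["score"],
        "rouge": rouge.compute(predictions=decoded_preds,
                               references=decoded_labels)["rougeL"],
        "gen_len": float(gen_len),
    }
```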
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:43b72100fb62c015d8521db5551793b983f576d26112f4361bd607758a29db7d
+oid sha256:0b5fd1449e34d1864823c7733416e774bc49f2ea7b6da0bb720fd43c9f6c1d06
 size 5044160
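The new `adapter_model.safetensors` pointer means the adapter weights themselves were retrained, not just the card. A minimal sketch for loading and using the adapter follows; the repo id `yasmineee/NLLB_DoRA` is a guess inferred from the card, not confirmed by this commit.

```python
# Sketch: load the published DoRA adapter on top of the frozen base model.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PeftModel

base_id = "facebook/nllb-200-distilled-600M"
tokenizer = AutoTokenizer.from_pretrained(base_id, src_lang="arb_Arab", tgt_lang="eng_Latn")
base = AutoModelForSeq2SeqLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base, "yasmineee/NLLB_DoRA")  # hypothetical repo id

inputs = tokenizer("مرحبا بالعالم", return_tensors="pt")  # "Hello, world" in Arabic
out = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("eng_Latn"),  # target language
    max_new_tokens=50,
)
print(tokenizer.batch_decode(out, skip_special_tokens=True))
```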
tokenizer.json CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:59075c9f30de2258f65f3a346147cf7b9938a3d043779cb7ee39e73a7977dd7e
-size 17331274
+oid sha256:2dde13bc0ee889a9225a407b8f8ede4db6eb7baa4da336ce0091f4f2a4351138
+size 17331373
tokenizer_config.json CHANGED
@@ -1864,9 +1864,6 @@
   "bos_token": "<s>",
   "clean_up_tokenization_spaces": true,
   "cls_token": "<s>",
-  "device_map": {
-    "": 0
-  },
   "eos_token": "</s>",
   "legacy_behaviour": false,
   "load_in_8bit": true,
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7f9f708fdce4245f8effa48a9c5086e6b0040278c806e4e7a1d1db400d655500
-size 5304
+oid sha256:5ccc9d592c44930eb6d26a94c0ab38bf33e8e48e25cc537c44befbefb45c5252
+size 5368
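One note on the `tokenizer_config.json` edit above: `device_map` (and `load_in_8bit`, which is still present in that file) are model-loading options, not tokenizer options, and appear to have leaked into the tokenizer config when it was saved. A sketch of where they would normally go (illustrative; `load_in_8bit=True` is the older kwarg style and requires `bitsandbytes`):

```python
# These options belong on the model, not the tokenizer.
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained(
    "facebook/nllb-200-distilled-600M",
    device_map={"": 0},   # place the whole model on GPU 0 (requires accelerate)
    load_in_8bit=True,    # 8-bit weights via bitsandbytes
)
```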