---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- break_data
metrics:
- bleu
model-index:
- name: t5-large-finetuned-break-qdmr-decomposition
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: break_data
      type: break_data
      config: QDMR
      split: validation
      args: QDMR
    metrics:
    - name: Bleu
      type: bleu
      value: 0.22169382457557757
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# t5-large-finetuned-break-qdmr-decomposition

This model is a fine-tuned version of [t5-large](https://huggingface.co/t5-large) on the break_data dataset (QDMR configuration).
It achieves the following results on the evaluation set (see the reproduction sketch after the list):
- Loss: 0.1729
- Bleu: 0.2217
- Precisions (1- to 4-gram): [0.928997558602713, 0.8089017135403285, 0.702859772673759, 0.6237525532535746]
- Brevity Penalty: 0.2926
- Length Ratio: 0.4487
- Translation Length: 108954
- Reference Length: 242845
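
The reported fields (`precisions`, `brevity_penalty`, `length_ratio`, `translation_length`, `reference_length`) match the output format of the `bleu` metric from the 🤗 `evaluate` library. A minimal reproduction sketch, assuming `evaluate` is installed; the toy prediction/reference strings are illustrative, not drawn from the dataset:

```python
import evaluate

# Corpus-level BLEU with 1- to 4-gram precisions, brevity penalty,
# and length statistics, in the same format as reported above.
bleu = evaluate.load("bleu")

predictions = ["return rivers ;return #1 in Europe ;return the longest of #2"]
references = [["return rivers ;return #1 located in Europe ;return the longest of #2"]]

results = bleu.compute(predictions=predictions, references=references)
print(results["bleu"], results["precisions"], results["brevity_penalty"])
```

Note the low brevity penalty: with a length ratio of about 0.45, BP = exp(1 - 1/0.45) ≈ 0.29, which is what holds corpus BLEU to 0.22 despite high n-gram precisions; the generated decompositions are much shorter than the references.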

## Model description

More information needed

## Intended uses & limitations

More information needed
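
Pending a fuller description, the snippet below sketches the expected inference flow: a natural-language question in, a QDMR decomposition out. The repository id is inferred from the model name and may need adjusting; the generation settings are illustrative, not taken from the training setup.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Repository id assumed from the model name above; adjust if the model
# lives under a different namespace.
model_id = "MatthisHoules/t5-large-finetuned-break-qdmr-decomposition"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

question = "What is the longest river in the country with the largest population?"
inputs = tokenizer(question, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```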

## Training and evaluation data

More information needed
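
The model-index above points at the QDMR configuration of break_data, evaluated on its validation split. A minimal loading sketch with 🤗 Datasets (field access is illustrative):

```python
from datasets import load_dataset

# QDMR configuration of BREAK, as declared in the front matter.
# Works with Datasets 2.13.1 (the version listed below); newer versions
# may require trust_remote_code=True for script-based datasets.
break_qdmr = load_dataset("break_data", "QDMR")
print(break_qdmr)                    # train / validation / test splits
print(break_qdmr["validation"][0])   # a question paired with its decomposition
```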

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `Seq2SeqTrainingArguments` reconstruction follows the list):
- learning_rate: 0.0001
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 64
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
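
These values map onto `Seq2SeqTrainingArguments` roughly as follows. This is a reconstruction from the list above, not the original training script; `output_dir` and the evaluation strategy are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

# Reconstruction of the listed hyperparameters. Note that
# 2 (train_batch_size) x 64 (gradient_accumulation_steps) = 128,
# the total_train_batch_size above; Adam with betas=(0.9, 0.999)
# and epsilon=1e-08 is the Trainer default optimizer.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-large-finetuned-break-qdmr-decomposition",  # assumed
    learning_rate=1e-4,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=64,
    num_train_epochs=10,
    lr_scheduler_type="linear",
    seed=42,
    evaluation_strategy="epoch",  # assumed from the per-epoch results below
)
```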

### Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length |
|:-------------:|:-----:|:----:|:---------------:|:------:|:--------------------------------------------------------------------------------:|:---------------:|:------------:|:------------------:|:----------------:|
| No log | 1.0 | 346 | 0.2217 | 0.2190 | [0.9212396799650076, 0.7929651493459373, 0.6788405612515656, 0.5938190356122556] | 0.2973 | 0.4519 | 109738 | 242845 |
| 0.3597 | 2.0 | 692 | 0.1898 | 0.2213 | [0.9278319373884388, 0.8053505444154309, 0.6955454787943451, 0.6142312076867599] | 0.2944 | 0.4499 | 109245 | 242845 |
| 0.1943 | 3.0 | 1038 | 0.1780 | 0.2213 | [0.9274868270332188, 0.805860010851872, 0.6987019924149351, 0.6179670572886331] | 0.2936 | 0.4494 | 109125 | 242845 |
| 0.1943 | 4.0 | 1385 | 0.1722 | 0.2209 | [0.9296421064226247, 0.8077246177717601, 0.6996456975263051, 0.618521199103474] | 0.2926 | 0.4486 | 108943 | 242845 |
| 0.1588 | 5.0 | 1731 | 0.1708 | 0.2221 | [0.9263551333376084, 0.8062900028599888, 0.7016414100962206, 0.6226711690731253] | 0.2938 | 0.4495 | 109159 | 242845 |
| 0.1395 | 6.0 | 2077 | 0.1699 | 0.2209 | [0.9307313480922355, 0.8116381660470879, 0.7052247221178113, 0.6255682084446319] | 0.2907 | 0.4473 | 108635 | 242845 |
| 0.1395 | 7.0 | 2423 | 0.1699 | 0.2219 | [0.9294629418890643, 0.8099284613256393, 0.7035550704165061, 0.623971523603898] | 0.2927 | 0.4487 | 108964 | 242845 |
| 0.1245 | 8.0 | 2770 | 0.1717 | 0.2215 | [0.9293905921457364, 0.8091923795588686, 0.7026416387368962, 0.6239635641714353] | 0.2924 | 0.4485 | 108909 | 242845 |
| 0.1152 | 9.0 | 3116 | 0.1724 | 0.2215 | [0.9294489230034706, 0.8091424956007671, 0.7027003876051995, 0.6234366789280084] | 0.2924 | 0.4485 | 108914 | 242845 |
| 0.1152 | 9.99 | 3460 | 0.1729 | 0.2217 | [0.928997558602713, 0.8089017135403285, 0.702859772673759, 0.6237525532535746] | 0.2926 | 0.4487 | 108954 | 242845 |

### Framework versions

- Transformers 4.30.2
- PyTorch 2.0.1+cu118
- Datasets 2.13.1
- Tokenizers 0.13.3