harouzie committed on
Commit 491e886
1 Parent(s): 6ce2957

Update README.md

Files changed (1)
  1. README.md +115 -1
README.md CHANGED

metrics:
- rouge
library_name: transformers
pipeline_tag: summarization
---

# BART (base-sized model) fine-tuned on `xsum`

**Disclaimer**: This [`bart-base`](https://huggingface.co/facebook/bart-base) model was fine-tuned on only a small part of the [`xsum`](https://huggingface.co/datasets/xsum) dataset. Due to limited resources (a single P100 GPU), we trained it in several stages on different slices of the data; the progress is described below. You may want to fine-tune this model on more data before using it.

## Model description

BART has achieved state-of-the-art results on the CNN/Daily Mail and XSum datasets for summarization tasks.

- On the CNN/Daily Mail dataset, BART achieved a `ROUGE-2` score of 21.28, the highest reported score on this dataset as of September 2021. The previous state-of-the-art model, [`google/PEGASUS`](https://huggingface.co/google/pegasus-xsum), achieved a `ROUGE-2` score of 21.00. BART also achieved state-of-the-art results on several other metrics, such as `ROUGE-1` and `ROUGE-L`.

- On the XSum dataset, BART achieved a `ROUGE-2` score of 22.27, the highest reported score on this dataset as of September 2021. The previous state-of-the-art model, T5, achieved a `ROUGE-2` score of 16.32. BART also achieved state-of-the-art results on several other metrics, such as `ROUGE-1` and `ROUGE-L`.

BART SOTA on CNN/DM:

```python
{
    'eval_rouge1': 44.16,
    'eval_rouge2': 21.28,
    'eval_rougeL': 40.90
}
```

BART SOTA on XSum:

```python
{
    'eval_rouge1': 45.14,
    'eval_rouge2': 22.27,
    'eval_rougeL': 37.25
}
```

## Training Strategy

### **First round**

First, we probed GPU memory limits by training on the first 10k samples with a batch size of 16.

Data: train/test/validation[10000:1000:1000] \
Epochs: 3

Evaluation:

```python
{
    'eval_loss': 3.34855318069458,
    'eval_rouge1': 35.1931,
    'eval_rouge2': 13.7162,
    'eval_rougeL': 28.4343,
    'eval_rougeLsum': 28.4329,
    'eval_gen_len': 19.58,
    'eval_runtime': 111.2625,
    'eval_samples_per_second': 8.988,
    'eval_steps_per_second': 2.247,
    'epoch': 3.0
}
```
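
For reference, a slice like `train/test/validation[10000:1000:1000]` can be taken with the `datasets` library. This is a minimal sketch, assuming the slices are simply the first rows of each split:

```python
from datasets import load_dataset

# First-round slices: first 10k train / 1k test / 1k validation rows
# (the exact row order is an assumption).
raw = load_dataset("xsum")
train = raw["train"].select(range(10_000))
test = raw["test"].select(range(1_000))
validation = raw["validation"].select(range(1_000))
```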

### **Second round**

In the second round, we roughly doubled the setup: we picked the next 20k samples (no overlap with the first 10k), kept the same batch size of 16, and increased the number of epochs to 5.

Data: train/test/validation split[20000:2000:2000] \
Epochs: 5

Evaluation:

```python
{
    'eval_loss': 3.2764062881469727,
    'eval_rouge1': 36.4663,
    'eval_rouge2': 15.1419,
    'eval_rougeL': 30.0491,
    'eval_rougeLsum': 30.0254,
    'eval_gen_len': 19.619,
    'eval_runtime': 217.6418,
    'eval_samples_per_second': 9.189,
    'eval_steps_per_second': 2.297,
    'epoch': 5.0
}
```
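
The second-round slice can be built the same way. Assuming "next 20k" means rows 10k–30k of the train split (and, analogously, the next 2k rows of test/validation), a minimal sketch:

```python
from datasets import load_dataset

# Second-round slices: disjoint from the rows used in the first round
# (the exact ranges are assumptions based on the description above).
raw = load_dataset("xsum")
train = raw["train"].select(range(10_000, 30_000))
test = raw["test"].select(range(1_000, 3_000))
validation = raw["validation"].select(range(1_000, 3_000))
```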

Our draft training seems to have converged, but it has not yet reached the SOTA scores reported in the paper. Stay tuned for round 3.
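
If you retrain the model and want to compare against the ROUGE numbers above, the `evaluate` library can compute them. A minimal sketch with toy strings (not our evaluation data):

```python
import evaluate

# ROUGE scores like the eval_rouge1/2/L values above come from the `rouge` metric.
rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["13 dead after earthquake strikes southern Ecuador."],
    references=["At least 13 people died after an earthquake struck southern Ecuador."],
)
print(scores)  # e.g. {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```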

## How to use

Here is how to use this model for summarization (a sketch for continuing fine-tuning on more data follows at the end of this section):
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from transformers import pipeline

checkpoint = 'harouzie/bart-base-xsum'
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# This news snippet was cited from CNN: https://edition.cnn.com/2023/03/18/americas/ecuador-earthquake
news = """
At least 13 people died after a magnitude 6.8 earthquake struck southern Ecuador on Saturday afternoon, according to government officials.

The earthquake struck near the southern town of Baláo and was more than 65 km (nearly 41 miles) deep, according to the United States Geological Survey.

An estimated 461 people were injured in the quake, according to a report from the Ecuadorian president’s office. The government had previously reported that 16 people were killed but later revised the death toll.

In the province of El Oro, at least 11 people died. At least one other death was reported in the province of Azuay, according to the communications department for Ecuador’s president. In an earlier statement, authorities said the person in Azuay was killed when a wall collapsed onto a car and that at least three of the victims in El Oro died when a security camera tower came down.
"""

# Build a summarization pipeline from the fine-tuned checkpoint and run it.
summarizer = pipeline(task="summarization", model=model, tokenizer=tokenizer)

summarizer(news)
```

> `>>> [{'summary_text': 'At least 13 people have been killed and more than 500 injured in an earthquake in Ecuador, officials say.'}]`
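
To continue fine-tuning on more data, a minimal sketch with `Seq2SeqTrainer` might look like the following. The batch size of 16 follows the rounds described above; the data ranges, truncation lengths, and output directory are illustrative assumptions, not our exact setup:

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

checkpoint = "harouzie/bart-base-xsum"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# A fresh slice of xsum, e.g. rows beyond the 30k already seen in
# rounds 1 and 2 (the exact ranges here are assumptions).
raw = load_dataset("xsum")
train = raw["train"].select(range(30_000, 40_000))
validation = raw["validation"].select(range(3_000, 4_000))

def preprocess(batch):
    # Truncation lengths below are illustrative, not our exact values.
    model_inputs = tokenizer(batch["document"], max_length=1024, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=64, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

train = train.map(preprocess, batched=True, remove_columns=train.column_names)
validation = validation.map(preprocess, batched=True, remove_columns=validation.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="bart-base-xsum-round3",  # hypothetical output directory
    per_device_train_batch_size=16,      # batch size used in rounds 1 and 2
    num_train_epochs=5,
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train,
    eval_dataset=validation,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```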