---
language: en
tags:
  - summarization
license: apache-2.0
datasets:
  - cnn_dailymail
  - xsum
thumbnail: https://huggingface.co/front/thumbnails/distilbart_medium.png
---

# Distilbart-cnn-12-6

## Table of Contents

- [Model Details](#model-details)
- [How to Get Started With the Model](#how-to-get-started-with-the-model)
- [Uses](#uses)
- [Risks, Limitations and Biases](#risks-limitations-and-biases)
- [Training](#training)
- [Evaluation](#evaluation)

## Model Details

- **Model Description:** DistilBART is a distilled version of BART fine-tuned for summarization.
- **Developed by:** Sam Shleifer
- **Model Type:** Summarization
- **Language(s):** English
- **License:** Apache-2.0
- **Parent Model:** See the [BART large CNN model](https://huggingface.co/facebook/bart-large-cnn) for more information about the BART large-sized model, which is similarly trained on the CNN/DailyMail dataset.
- **Resources for more information:** [Pre-trained Summarization Distillation (Shleifer & Rush, 2020)](https://arxiv.org/abs/2010.13002)

## How to Get Started With the Model

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Download the tokenizer and model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("sshleifer/distilbart-cnn-12-6")
model = AutoModelForSeq2SeqLM.from_pretrained("sshleifer/distilbart-cnn-12-6")
```
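
As a minimal usage sketch, the loaded model can summarize text with `generate`; the input text and generation parameters below are illustrative assumptions, not values specified by this card:

```python
# Illustrative only: the article text and generation parameters
# are assumptions, not values from this model card.
ARTICLE = (
    "The tower is 324 metres (1,063 ft) tall, about the same height "
    "as an 81-storey building, and the tallest structure in Paris."
)

inputs = tokenizer(ARTICLE, return_tensors="pt", truncation=True, max_length=1024)
summary_ids = model.generate(**inputs, num_beams=4, min_length=56, max_length=142)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```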

## Uses

### Direct Use

This model can be used for text summarization.
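
A minimal sketch of direct use through the `transformers` summarization pipeline; the input string is an illustrative placeholder:

```python
from transformers import pipeline

# Sketch of direct use; the input string below is a placeholder.
summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")
result = summarizer(
    "Paste a long news article here ...",
    max_length=130,
    min_length=30,
    do_sample=False,
)
print(result[0]["summary_text"])
```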

## Risks, Limitations and Biases

### Limitations

This model makes use of the CNN/DailyMail dataset, an English-language dataset containing just over 300k unique news articles written by journalists at CNN and the Daily Mail. The BCP-47 code for English as generally spoken in the United States is en-US, and the BCP-47 code for English as generally spoken in the United Kingdom is en-GB. It is unknown whether other varieties of English are represented in the data.

### Biases

Bordia and Bowman (2019) explore measuring gender bias and debiasing techniques in the CNN/DailyMail dataset, the Penn Treebank, and WikiText-2. They find the CNN/DailyMail dataset to have slightly lower gender bias on their metric than the other datasets, but it still shows evidence of gender bias when looking at words such as "fragile".

Further information, e.g. regarding uses, out-of-scope uses, and the training procedure for the CNN/DailyMail dataset, is available in its dataset card.

## Training

This checkpoint should be loaded into `BartForConditionalGeneration.from_pretrained`. See the [BART docs](https://huggingface.co/docs/transformers/model_doc/bart) for more information.
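
A minimal loading sketch following that recommendation:

```python
from transformers import BartForConditionalGeneration, BartTokenizer

# Load the distilled checkpoint into the BART conditional-generation
# class, as recommended above.
model = BartForConditionalGeneration.from_pretrained("sshleifer/distilbart-cnn-12-6")
tokenizer = BartTokenizer.from_pretrained("sshleifer/distilbart-cnn-12-6")
```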

## Evaluation

Metrics for DistilBART models:

| Model Name                 | Params (millions) | Inference Time (ms) | Speedup | ROUGE-2 | ROUGE-L |
|----------------------------|-------------------|---------------------|---------|---------|---------|
| distilbart-xsum-12-1       | 222               | 90                  | 2.54    | 18.31   | 33.37   |
| distilbart-xsum-6-6        | 230               | 132                 | 1.73    | 20.92   | 35.73   |
| distilbart-xsum-12-3       | 255               | 106                 | 2.16    | 21.37   | 36.39   |
| distilbart-xsum-9-6        | 268               | 136                 | 1.68    | 21.72   | 36.61   |
| bart-large-xsum (baseline) | 406               | 229                 | 1.00    | 21.85   | 36.50   |
| distilbart-xsum-12-6       | 306               | 137                 | 1.68    | 22.12   | 36.99   |
| bart-large-cnn (baseline)  | 406               | 381                 | 1.00    | 21.06   | 30.63   |
| distilbart-12-3-cnn        | 255               | 214                 | 1.78    | 20.57   | 30.00   |
| distilbart-12-6-cnn        | 306               | 307                 | 1.24    | 21.26   | 30.59   |
| distilbart-6-6-cnn         | 230               | 182                 | 2.09    | 20.17   | 29.70   |
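
Speedup is relative to the corresponding `bart-large` baseline, i.e. baseline inference time divided by the model's inference time; for example, `distilbart-12-6-cnn` gives 381 ms / 307 ms ≈ 1.24, as in the table.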