---
tags:
- translation
license: apache-2.0
---

# opus-mt-it-en

## Table of Contents
- [Model Details](#model-details)
- [How to Get Started With the Model](#how-to-get-started-with-the-model)
- [Uses](#uses)
- [Risks, Limitations and Biases](#risks-limitations-and-biases)
- [Training](#training)
- [Evaluation](#evaluation)

## Model Details

**Model Description:**
- **Developed by:** [Language Technology Research Group at the University of Helsinki](https://blogs.helsinki.fi/language-technology/)
- **Model Type:** transformer-align
- **Language(s):**
  - Source Language: Italian
  - Target Language: English
- **License:** apache-2.0
- **Resources for more information:**
  - [GitHub Repo](https://github.com/Helsinki-NLP/OPUS-MT-train)

## How to Get Started With the Model

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-it-en")
model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-it-en")
```

## Uses

#### Direct Use

This model can be used for translation and text-to-text generation.

## Risks, Limitations and Biases

**CONTENT WARNING: Readers should be aware this section contains content that is disturbing, offensive, and can propagate historical and current stereotypes.**

Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)).

Further details about the dataset for this model can be found in the OPUS readme: [it-en](https://github.com/Helsinki-NLP/OPUS-MT-train/blob/master/models/it-en/README.md)

## Training

#### Training Data

##### Preprocessing

* **Pre-processing:** Normalization + SentencePiece
* **Dataset:** [opus](https://github.com/Helsinki-NLP/Opus-MT)
* **Download original weights:** [opus-2019-12-18.zip](https://object.pouta.csc.fi/OPUS-MT-models/it-en/opus-2019-12-18.zip)
* **Test set translations:** [opus-2019-12-18.test.txt](https://object.pouta.csc.fi/OPUS-MT-models/it-en/opus-2019-12-18.test.txt)

## Evaluation

### Results

* **Test set scores:** [opus-2019-12-18.eval.txt](https://object.pouta.csc.fi/OPUS-MT-models/it-en/opus-2019-12-18.eval.txt)

#### Benchmarks

| testset               | BLEU | chr-F |
|-----------------------|------|-------|
| newssyscomb2009.it.en | 35.3 | 0.600 |
| newstest2009.it.en    | 34.0 | 0.594 |
| Tatoeba.it.en         | 70.9 | 0.808 |
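
#### Example Translation

The snippet under [How to Get Started With the Model](#how-to-get-started-with-the-model) only loads the tokenizer and weights. The sketch below extends it to a full Italian-to-English translation call. It assumes the `transformers`, `torch`, and `sentencepiece` packages are installed; the input sentence and the printed output are illustrative only, not documented results.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the tokenizer and model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-it-en")
model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-it-en")

# Illustrative Italian input (any list of source sentences works)
src_text = ["La vita è bella."]

# Tokenize, generate the English translation, and decode back to text
batch = tokenizer(src_text, return_tensors="pt", padding=True)
generated = model.generate(**batch)
translation = tokenizer.batch_decode(generated, skip_special_tokens=True)

print(translation)  # expected output along the lines of ["Life is beautiful."]
```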