
mBART-based translation model

This model was trained to translate multiple sentences at once, rather than one sentence at a time.

As a result, it may occasionally combine sentences or add an extra sentence in the output.

This is the same model as the one provided on CLARIN: https://repository.clarin.is/repository/xmlui/handle/20.500.12537/278

You can use the following example to get started:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline
import torch

# Use the current CUDA device if available, otherwise run on CPU (-1)
device = torch.cuda.current_device() if torch.cuda.is_available() else -1

tokenizer = AutoTokenizer.from_pretrained(
    "mideind/nmt-doc-en-is-2022-10", src_lang="en_XX", tgt_lang="is_IS"
)
model = AutoModelForSeq2SeqLM.from_pretrained("mideind/nmt-doc-en-is-2022-10")

translate = pipeline(
    "translation_XX_to_YY", model=model, tokenizer=tokenizer,
    device=device, src_lang="en_XX", tgt_lang="is_IS"
)

target_seq = translate(
    "I am using a translation model to translate text from English to Icelandic.",
    src_lang="en_XX", tgt_lang="is_IS", max_length=128
)
# Strip any leading/trailing 'Y' and space characters from the output
print(target_seq[0]['translation_text'].strip('YY '))
```
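
Since the model is trained on multi-sentence inputs, you can also pass a short paragraph in a single call. The sketch below is illustrative (the example text and `max_length` value are not from the model card) and reuses the `translate` pipeline defined above:

```python
# Document-level input: several English sentences passed as one string.
# As noted above, the model may merge sentences or add an extra one.
document = (
    "The weather was cold last week. "
    "Nevertheless, we went hiking in the mountains. "
    "Everyone enjoyed the trip."
)
result = translate(document, src_lang="en_XX", tgt_lang="is_IS", max_length=256)
print(result[0]['translation_text'].strip('YY '))
```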