

  
    Model: RoBERTa Large
    Lang: IT
  

Model description

This is a RoBERTa Large [1] model for the Italian language, obtained by taking XLM-RoBERTa-Large [2] (xlm-roberta-large) as a starting point and specializing it for Italian by modifying the embedding layer (as in [3], computing document-level frequencies over the Wikipedia dataset).
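The idea of reducing the embedding layer from document-level frequencies can be illustrated with a toy sketch. It assumes a simple rule (keep the special tokens plus every token that occurs in at least `min_doc_freq` documents) and uses plain Python lists in place of real tensors. The function names, the threshold, and the tiny corpus are hypothetical and are not taken from the procedure actually used to build this model.

```python
from collections import Counter


def document_frequencies(tokenized_docs):
    """Count, for each token id, the number of documents that contain it."""
    freqs = Counter()
    for doc in tokenized_docs:
        freqs.update(set(doc))  # set(): count each token once per document
    return freqs


def prune_embeddings(embeddings, tokenized_docs, special_ids, min_doc_freq=1):
    """Keep the embedding rows of special tokens and of frequent tokens.

    Returns the reduced matrix and a map from old token ids to new ids.
    """
    freqs = document_frequencies(tokenized_docs)
    keep = [
        tok_id
        for tok_id in range(len(embeddings))
        if tok_id in special_ids or freqs[tok_id] >= min_doc_freq
    ]
    old_to_new = {old: new for new, old in enumerate(keep)}
    reduced = [embeddings[old] for old in keep]
    return reduced, old_to_new


# Toy multilingual vocabulary: 6 tokens with 2-dimensional embeddings.
embeddings = [[0.0, 0.0], [0.1, 0.1], [0.2, 0.2], [0.3, 0.3], [0.4, 0.4], [0.5, 0.5]]
# Hypothetical target-language corpus, already tokenized into token ids.
italian_docs = [[2, 3, 3], [3, 5], [2, 5, 5]]

reduced, old_to_new = prune_embeddings(
    embeddings, italian_docs, special_ids={0}, min_doc_freq=2
)
print(len(reduced), old_to_new)
```

In this toy run, tokens 1 and 4 never appear in the corpus, so their rows are dropped and the remaining ids are renumbered contiguously. A real pipeline would also have to rebuild the tokenizer vocabulary with the same id mapping.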

The resulting model has 356M parameters, a vocabulary of 50,670 tokens, and a size of ~1.42 GB.
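As a quick sanity check, the stated size is consistent with the parameter count under the assumption that weights are stored as 32-bit floats (4 bytes each). The same arithmetic gives the size of the token embedding matrix for a hidden size of 1024, which is the RoBERTa Large hidden size. The constant names below are illustrative only.

```python
HIDDEN_SIZE = 1024          # hidden size of RoBERTa Large
VOCAB_SIZE = 50_670         # vocabulary size stated above
TOTAL_PARAMS = 356_000_000  # rounded parameter count stated above
BYTES_PER_PARAM = 4         # assumption: weights stored as float32

# One embedding row of HIDDEN_SIZE values per vocabulary entry.
embedding_params = VOCAB_SIZE * HIDDEN_SIZE
# Decimal gigabytes (1 GB = 10**9 bytes).
size_gb = TOTAL_PARAMS * BYTES_PER_PARAM / 1e9

print(f"token embedding parameters: {embedding_params:,}")
print(f"approximate size: {size_gb:.2f} GB")
```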

Quick usage

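from transformers import RobertaTokenizerFast, RobertaForMaskedLM
from transformers import pipeline

# Load the tokenizer and the masked-language model from the Hugging Face Hub
tokenizer = RobertaTokenizerFast.from_pretrained("osiria/roberta-large-italian")
model = RobertaForMaskedLM.from_pretrained("osiria/roberta-large-italian")

# Build a fill-mask pipeline around the loaded model and tokenizer
pipe = pipeline("fill-mask", model=model, tokenizer=tokenizer)

# Predict the most likely replacements for the <mask> token
pipe("Milano è una <mask> italiana")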

[{'score': 0.9284337759017944,
  'token': 7786,
  'token_str': 'città',
  'sequence': 'Milano è una città italiana'},
 {'score': 0.03296631574630737,
  'token': 26960,
  'token_str': 'capitale',
  'sequence': 'Milano è una capitale italiana'},
 {'score': 0.015821034088730812,
  'token': 8043,
  'token_str': 'provincia',
  'sequence': 'Milano è una provincia italiana'},
 {'score': 0.007335659582167864,
  'token': 18841,
  'token_str': 'regione',
  'sequence': 'Milano è una regione italiana'},
 {'score': 0.006183209829032421,
  'token': 50152,
  'token_str': 'cittadina',
  'sequence': 'Milano è una cittadina italiana'}]
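
The pipeline output is a list of dictionaries ordered by score, so it can be post-processed with ordinary Python. The sketch below keeps only the candidates above a score threshold. The helper name `best_tokens` and the 0.01 threshold are hypothetical, and the scores are copied from the output shown above rather than recomputed.

```python
def best_tokens(predictions, min_score=0.01):
    """Return (token_str, score) pairs whose score is at least min_score."""
    return [
        (p["token_str"], p["score"])
        for p in predictions
        if p["score"] >= min_score
    ]


# Scores and tokens copied from the example output above.
predictions = [
    {"score": 0.9284337759017944, "token_str": "città"},
    {"score": 0.03296631574630737, "token_str": "capitale"},
    {"score": 0.015821034088730812, "token_str": "provincia"},
    {"score": 0.007335659582167864, "token_str": "regione"},
    {"score": 0.006183209829032421, "token_str": "cittadina"},
]

kept = best_tokens(predictions)
for token, score in kept:
    print(f"{token}: {score:.3f}")
```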

References

[1] https://arxiv.org/abs/1907.11692

[2] https://arxiv.org/abs/1911.02116

[3] https://arxiv.org/abs/2010.05609

License

The model is released under the MIT license.
