uvegesistvan/Hun_RoBERTa_Plain

Model description

Fine-tuned XLM-RoBERTa model for Hungarian, trained on a dataset provided by the National Tax and Customs Administration - Hungary (NAV): Public Accessibility Programme.

Intended uses & limitations

The model can be used like any other XLM-RoBERTa model. It has been tested on distinguishing "accessible" from "original" sentences, where:

"accessible" - "Label_1": a sentence that can be considered comprehensible (with regard to Plain Language directives).
"original" - "Label_0": a sentence that needs to be rephrased in order to follow Plain Language guidelines.

Training

Fine-tuned version of the XLM-RoBERTa model (FacebookAI/xlm-roberta-base), trained on information materials provided by NAV linguistic experts.

Eval results

Class	Precision	Recall	F-Score
Original / Label_0	0.76	0.71	0.73
Accessible / Label_1	0.72	0.78	0.75
accuracy			0.74
macro avg	0.74	0.74	0.74
weighted avg	0.74	0.74	0.74
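
The F-Score column can be cross-checked against the precision and recall columns. The sketch below assumes the reported F-Score is the standard F1 (the harmonic mean of precision and recall); the card does not state this explicitly, but the rounded values are consistent with it. The function name `f_score` is introduced here for illustration only.

```python
def f_score(precision, recall):
    """F1: harmonic mean of precision and recall (assumed definition)."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-Score) copied from the table above.
reported = {
    "Original / Label_0": (0.76, 0.71, 0.73),
    "Accessible / Label_1": (0.72, 0.78, 0.75),
}

for name, (p, r, f_reported) in reported.items():
    print(f"{name}: computed F1 = {f_score(p, r):.2f}, reported = {f_reported:.2f}")

# The macro average is the unweighted mean of the per-class scores.
macro_f = sum(f_score(p, r) for p, r, _ in reported.values()) / len(reported)
print(f"macro avg F1 = {macro_f:.2f}")
```

Because the inputs are already rounded to two decimals, a recomputed value could differ from a reported one in the last digit; here the two agree for both classes.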

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer and the fine-tuned sequence classification model from the Hub.
tokenizer = AutoTokenizer.from_pretrained("uvegesistvan/Hun_RoBERTa_Plain")
model = AutoModelForSequenceClassification.from_pretrained("uvegesistvan/Hun_RoBERTa_Plain")
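
Loading the model is only the first step; classifying a sentence also requires tokenizing it, running a forward pass, and mapping the predicted class index to a label. The following is a minimal sketch of that flow, not the author's own inference code. It assumes that class index 0 corresponds to "Label_0" (original) and index 1 to "Label_1" (accessible), and that PyTorch is installed. The helper names `softmax`, `interpret`, and `classify` are hypothetical.

```python
import math

# Assumption: class index 0 is "Label_0" (original), index 1 is "Label_1" (accessible).
ID_TO_NAME = {0: "original", 1: "accessible"}


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def interpret(logits):
    """Return (label name, probability) for the highest-scoring class."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return ID_TO_NAME[idx], probs[idx]


def classify(sentence, model_name="uvegesistvan/Hun_RoBERTa_Plain"):
    """Download the model (on first use) and classify one sentence."""
    # Imported lazily so the helpers above work without these packages.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()

    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return interpret(logits)
```

Calling `classify("...")` with a Hungarian sentence returns a pair such as a label name and its probability. For repeated calls, loading the tokenizer and model once outside the function would avoid reloading them each time.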
