metadata

license: cc-by-nc-4.0
language:
  - hu
metrics:
  - accuracy
model-index:
  - name: huBERTPlain
    results:
      - task:
          type: text-classification
        metrics:
          - type: accuracy
            value: 0.73

Model description

A cased BERT model for Hungarian, fine-tuned on a dataset from the Public Accessibility Programme of the National Tax and Customs Administration of Hungary (NAV).

Intended uses & limitations

The model can be used like any other (cased) BERT model. It has been tested on distinguishing "accessible" from "original" sentences, where:

"accessible" - "Label_1": a sentence that can be considered comprehensible according to Plain Language guidelines.
"original" - "Label_0": a sentence that needs to be rephrased in order to follow Plain Language guidelines.

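The two labels above can be mapped to readable names when running the classifier. The following is a minimal sketch, not an official usage recipe: it assumes the checkpoint is published under the repository id `uvegesistvan/huBERTPlain`, that it loads with the `transformers` text-classification pipeline, and that the raw label strings end in `_0` or `_1`. The helper names `readable_label` and `classify` are hypothetical.

```python
# Label meanings as described in this model card. The exact strings the
# checkpoint emits (e.g. "Label_1" vs. "LABEL_1") are an assumption, so
# only the numeric suffix is used for matching.
LABEL_MEANINGS = {"0": "original", "1": "accessible"}


def readable_label(raw_label: str) -> str:
    """Map a raw label such as 'Label_1' to 'accessible' or 'original'."""
    suffix = raw_label.rsplit("_", 1)[-1]
    # Unknown labels are returned unchanged rather than guessed.
    return LABEL_MEANINGS.get(suffix, raw_label)


def classify(sentences, model_id="uvegesistvan/huBERTPlain"):
    """Classify Hungarian sentences; the model is downloaded on first call."""
    # Imported lazily because transformers is a heavy dependency.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    predictions = classifier(list(sentences))
    return [
        {"text": text, "label": readable_label(pred["label"]), "score": pred["score"]}
        for text, pred in zip(sentences, predictions)
    ]
```

Calling `classify([...])` with a list of Hungarian sentences returns one dictionary per sentence with the readable label and the model's confidence score.
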
Training

Fine-tuned version of the original huBERT model (SZTAKI-HLT/hubert-base-cc), trained on information materials provided by NAV linguistic experts.

Eval results

Class	Precision	Recall	F-Score
Original / Label_0	0.71	0.79	0.75
Accessible / Label_1	0.76	0.67	0.71
accuracy			0.73
macro avg	0.74	0.73	0.73
weighted avg	0.74	0.73	0.73
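
As a sanity check on the table above, the per-class F-scores can be recomputed from the reported precision and recall. This sketch assumes that "F-Score" means the F1 score (the harmonic mean of precision and recall), which the model card does not state explicitly; the function name `f_score` is illustrative.

```python
def f_score(precision: float, recall: float) -> float:
    """F1 score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-score) copied from the table above.
reported = {
    "Original / Label_0": (0.71, 0.79, 0.75),
    "Accessible / Label_1": (0.76, 0.67, 0.71),
}

for name, (precision, recall, reported_f) in reported.items():
    recomputed = round(f_score(precision, recall), 2)
    print(f"{name}: recomputed {recomputed}, reported {reported_f}")

# The macro average is the unweighted mean of the per-class F-scores.
macro_f = round(sum(f_score(p, r) for p, r, _ in reported.values()) / len(reported), 2)
print(f"macro avg F-score: {macro_f}")
```

Under that assumption the recomputed values agree with the reported per-class and macro-averaged F-scores to two decimal places.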

BibTeX entry and citation info

If you use the model, please cite the following thesis:

Bibtex:

@PhDThesis{ Uveges:2023,
  author = {{\"U}veges, Istv{\'a}n},
  title  = {A k{\"o}z{\'e}rthet{\"o}s{\'e}g lehet{\"o}s{\'e}gei a jogi dom{\'e}n sz{\"o}vegeiben},
  year   = {2023},
  school = {Szegedi Tudom{\'a}nyegyetem}
}