---
license: apache-2.0
language:
- hu
metrics:
- accuracy
model-index:
- name: huBERTPlain
  results:
  - task:
      type: text-classification
    metrics:
    - type: f1
      value: 0.77
---

## Model description

Cased BERT model for Hungarian, fine-tuned on manually annotated parliamentary pre-agenda speeches scraped from `parlament.hu`.

## Intended uses & limitations

The model can be used like any other cased BERT model. It has been tested on sentence-level emotion recognition in parliamentary pre-agenda speeches, with the following labels:
* 'Label_0': Neutral
* 'Label_1': Fear
* 'Label_3': Sadness
* 'Label_4': Anger
* 'Label_5': Disgust
* 'Label_6': Success
* 'Label_7': Joy

## Training

The model is a fine-tuned version of the original huBERT model (`SZTAKI-HLT/hubert-base-cc`), trained on the HunEmPoli corpus.

## Eval results

| Class        | Precision | Recall | F-Score |
|--------------|-----------|--------|---------|
| Fear         | 0.625     | 0.625  | 0.625   |
| Sadness      | 0.8535    | 0.6291 | 0.7243  |
| Anger        | 0.7857    | 0.3437 | 0.4782  |
| Disgust      | 0.7154    | 0.8790 | 0.7888  |
| Success      | 0.8579    | 0.8683 | 0.8631  |
| Joy          | 0.549     | 0.6363 | 0.5894  |
| Trust        | 0.4705    | 0.5581 | 0.5106  |
| Macro AVG    | 0.7134    | 0.6281 | 0.6497  |
| Weighted AVG | 0.791     | 0.7791 | 0.7743  |
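
The per-class F-Score values appear consistent with the usual F1 definition, the harmonic mean of precision and recall: `2 * P * R / (P + R)`. The following is a minimal sketch that recomputes them from the precision and recall columns. The `f1_score` helper is hypothetical and written only for illustration; it is not part of the model's code or evaluation scripts.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall values copied from the per-class rows of the table.
per_class = {
    "Fear": (0.625, 0.625),
    "Sadness": (0.8535, 0.6291),
    "Anger": (0.7857, 0.3437),
    "Disgust": (0.7154, 0.8790),
    "Success": (0.8579, 0.8683),
    "Joy": (0.549, 0.6363),
    "Trust": (0.4705, 0.5581),
}

for name, (p, r) in per_class.items():
    print(f"{name}: {f1_score(p, r):.4f}")
```

The recomputed values agree with the F-Score column up to rounding. The Anger row shows why F1 is worth reporting alongside precision: its precision is high (0.7857), but the low recall (0.3437) pulls the harmonic mean down to 0.4782.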

## Usage

```py
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer and the fine-tuned classification model from the Hub
tokenizer = AutoTokenizer.from_pretrained("poltextlab/HunEmBERT8")
model = AutoModelForSequenceClassification.from_pretrained("poltextlab/HunEmBERT8")
```
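
Once the tokenizer and model are loaded, sentence-level predictions could be obtained roughly as follows. This is a sketch under assumptions, not the authors' inference code: the `classify` helper is hypothetical, it assumes PyTorch is installed, and it reports label names exactly as stored in the checkpoint's config (`model.config.id2label`), which may be formatted differently from the label list in this card.

```python
import torch


def classify(sentences, tokenizer, model):
    """Return one (label, probability) pair per input sentence."""
    # Tokenize the whole batch; padding and truncation keep the tensors rectangular.
    encoded = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    # Inference only, so gradient tracking is switched off.
    with torch.no_grad():
        logits = model(**encoded).logits
    # Turn the logits into probabilities and pick the most likely class per sentence.
    probs = torch.softmax(logits, dim=-1)
    scores, ids = probs.max(dim=-1)
    # Label names are read from the model config rather than hard-coded.
    return [(model.config.id2label[i.item()], s.item()) for i, s in zip(ids, scores)]
```

With `tokenizer` and `model` loaded as shown in the Usage snippet, `classify(["Ez egy példamondat."], tokenizer, model)` returns a list with one `(label, probability)` pair. The example sentence is a placeholder, not taken from the training corpus.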

### BibTeX entry and citation info

If you use the model, please cite the following paper:

Bibtex:
```bibtex
@{
}
```