---
license: apache-2.0
language:
- hu
metrics:
- accuracy
model-index:
- name: huBERTPlain
  results:
  - task:
      type: text-classification
    metrics:
    - type: f1
      value: 0.77
---

## Model description

Cased fine-tuned BERT model for Hungarian, trained on (manually annotated) parliamentary pre-agenda speeches scraped from `parlament.hu`.

## Intended uses & limitations

The model can be used like any other (cased) BERT model. It has been tested on sentence-level emotion recognition in (parliamentary) pre-agenda speeches, with the following labels:

* 'Label_0': Neutral
* 'Label_1': Fear
* 'Label_3': Sadness
* 'Label_4': Anger
* 'Label_5': Disgust
* 'Label_6': Success
* 'Label_7': Joy

## Training

Fine-tuned version of the original huBERT model (`SZTAKI-HLT/hubert-base-cc`), trained on the HunEmPoli corpus.

## Eval results

| Class | Precision | Recall | F-Score |
|--------------|-----------|--------|---------|
| Fear | 0.625 | 0.625 | 0.625 |
| Sadness | 0.8535 | 0.6291 | 0.7243 |
| Anger | 0.7857 | 0.3437 | 0.4782 |
| Disgust | 0.7154 | 0.8790 | 0.7888 |
| Success | 0.8579 | 0.8683 | 0.8631 |
| Joy | 0.549 | 0.6363 | 0.5894 |
| Trust | 0.4705 | 0.5581 | 0.5106 |
| Macro AVG | 0.7134 | 0.6281 | 0.6497 |
| Weighted AVG | 0.791 | 0.7791 | 0.7743 |

## Usage

```py
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer and the fine-tuned classifier from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("poltextlab/HunEmBERT8")
model = AutoModelForSequenceClassification.from_pretrained("poltextlab/HunEmBERT8")
```

### BibTeX entry and citation info

If you use the model, please cite the following paper:

```bibtex
@ARTICLE{10149341,
  author={{\"U}veges, Istv{\'a}n and Ring, Orsolya},
  journal={IEEE Access},
  title={HunEmBERT: a fine-tuned BERT-model for classifying sentiment and emotion in political communication},
  year={2023},
  volume={11},
  number={},
  pages={60267-60278},
  doi={10.1109/ACCESS.2023.3285536}
}
```
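
### Example: turning logits into emotion labels

The usage snippet in this card stops at loading the model, so the following is a minimal sketch of the step that usually comes next: converting a vector of class logits into one of the emotion names listed under "Intended uses & limitations". The helper names (`CARD_LABELS`, `softmax`, `top_emotion`) and the example logits are hypothetical and not part of the model or of `transformers`. The label map copies the card's list as written; index 2 has no name there, so the sketch falls back to the raw `Label_2` id instead of guessing.

```python
import math

# Label names exactly as listed in this card. Index 2 is not listed there,
# so it is deliberately left out instead of being guessed (assumption).
CARD_LABELS = {
    0: "Neutral",
    1: "Fear",
    3: "Sadness",
    4: "Anger",
    5: "Disgust",
    6: "Success",
    7: "Joy",
}


def softmax(logits):
    """Convert raw logits into probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_emotion(logits, labels=CARD_LABELS):
    """Return (label name, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    # Fall back to the raw "Label_<n>" id when the card gives no name.
    return labels.get(best, f"Label_{best}"), probs[best]


# Hypothetical logits for one sentence (made-up numbers, not model output).
example_logits = [0.1, -1.2, 0.3, 0.0, 2.5, 0.4, -0.7, -0.2]
name, prob = top_emotion(example_logits)
print(name, round(prob, 3))
```

With the model and tokenizer loaded as shown under "Usage" (and PyTorch installed), the logits for a sentence could be obtained with something like `model(**tokenizer(text, return_tensors="pt")).logits[0].tolist()` and passed to `top_emotion`; `text` here is a placeholder for your own input sentence.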