poltextlab/HunEmBERT8

	---
	license: apache-2.0
	language:
	- hu
	metrics:
	- accuracy
	model-index:
	- name: huBERTPlain
	  results:
	  - task:
	      type: text-classification
	    metrics:
	    - type: f1
	      value: 0.77
	---

	## Model description

	Cased, fine-tuned BERT model for Hungarian, trained on manually annotated parliamentary pre-agenda speeches scraped from `parlament.hu`.

	## Intended uses & limitations

	The model can be used like any other (cased) BERT model. It has been tested on recognizing emotions at the sentence level in (parliamentary) pre-agenda speeches, where:
	* 'Label_0': Neutral
	* 'Label_1': Fear
	* 'Label_3': Sadness
	* 'Label_4': Anger
	* 'Label_5': Disgust
	* 'Label_6': Success
	* 'Label_7': Joy

	## Training

	Fine-tuned version of the original huBERT model (`SZTAKI-HLT/hubert-base-cc`), trained on the HunEmPoli corpus.

	## Eval results

	| Class | Precision | Recall | F-Score |
	|-------|-----------|--------|---------|
	| Fear | 0.625 | 0.625 | 0.625 |
	| Sadness | 0.8535 | 0.6291 | 0.7243 |
	| Anger | 0.7857 | 0.3437 | 0.4782 |
	| Disgust | 0.7154 | 0.8790 | 0.7888 |
	| Success | 0.8579 | 0.8683 | 0.8631 |
	| Joy | 0.549 | 0.6363 | 0.5894 |
	| Trust | 0.4705 | 0.5581 | 0.5106 |
	| Macro AVG | 0.7134 | 0.6281 | 0.6497 |
	| Weighted AVG | 0.791 | 0.7791 | 0.7743 |
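
The per-class F-Score column appears to be the usual F1, the harmonic mean of precision and recall. The sketch below recomputes it from the precision and recall values in the table as a sanity check. It assumes that F-Score means F1; the names `f_score` and `rows` are illustrative and not part of the model's code.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1); 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-score), copied from the per-class rows of the table
rows = {
    "Fear": (0.625, 0.625, 0.625),
    "Sadness": (0.8535, 0.6291, 0.7243),
    "Anger": (0.7857, 0.3437, 0.4782),
    "Disgust": (0.7154, 0.8790, 0.7888),
    "Success": (0.8579, 0.8683, 0.8631),
    "Joy": (0.549, 0.6363, 0.5894),
    "Trust": (0.4705, 0.5581, 0.5106),
}

for name, (p, r, reported) in rows.items():
    print(f"{name:8s} computed={f_score(p, r):.4f} reported={reported:.4f}")
```

Low recall pulls the score down sharply: Anger has a precision of 0.7857 but a recall of only 0.3437, which is why its F-Score is the lowest in the table.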


	## Usage

	```py
	from transformers import AutoTokenizer, AutoModelForSequenceClassification

	tokenizer = AutoTokenizer.from_pretrained("poltextlab/HunEmBERT8")
	model = AutoModelForSequenceClassification.from_pretrained("poltextlab/HunEmBERT8")
	```
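
The snippet above only loads the tokenizer and the model. A minimal sketch of sentence-level prediction could look like the following. The helper names `softmax`, `top_label`, and `classify` are illustrative assumptions, not part of this repository. The label names are read from `model.config.id2label`, and how they correspond to the emotion list in this card is not verified here.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


def classify(sentence, model_name="poltextlab/HunEmBERT8"):
    """Predict a label for one sentence (downloads the model on first use)."""
    # Imported lazily so the helpers above work without torch/transformers installed.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)

    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():  # inference only, no gradients needed
        logits = model(**inputs).logits[0].tolist()
    return top_label(logits, model.config.id2label)
```

Calling `classify("Ez egy példamondat.")` (a made-up example sentence) would return a `(label, probability)` pair. For many sentences, loading the tokenizer and model once outside the function and batching the inputs would be more efficient.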

	### BibTeX entry and citation info

	If you use the model, please cite the following paper:

	Bibtex:
	```bibtex
	@{
	}
	```