---
library_name: transformers
license: apache-2.0
base_model: bert-base-uncased
tags:
- generated_from_keras_callback
model-index:
- name: huseyincenik/conll_ner_with_bert
  results: []
datasets:
- tner/conll2003
language:
- en
metrics:
- accuracy
pipeline_tag: token-classification
---

# huseyincenik/conll_ner_with_bert

This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the CoNLL-2003 dataset for Named Entity Recognition (NER).

## Model description

The model uses the BERT architecture with a token-classification head. It was fine-tuned on CoNLL-2003, a standard English NER dataset, and predicts one entity tag per input token.

## Intended uses & limitations

### Intended Uses

- **Named Entity Recognition**: The model identifies named entities in text and classifies them as location (LOC), organization (ORG), person (PER), or miscellaneous (MISC).

### Limitations

- **Domain Specificity**: The model was fine-tuned on the CoNLL-2003 dataset, which consists of news articles. It may not generalize well to other domains or to text types that are not represented in the training data.
- **Subword Tokens**: The model may occasionally tag subword tokens as entities, so the output may need post-processing to merge or drop these pieces.
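
One way to handle the subword limitation is to pass an `aggregation_strategy` (for example `"simple"`) to the `transformers` token-classification pipeline, which groups pieces into entities. Another is to merge WordPiece continuation pieces (those starting with `##`) by hand. The sketch below illustrates the second option only. The function name `merge_wordpieces` and the sample predictions are hypothetical, and the code assumes one prediction per token in sentence order, including tokens tagged `O`.

```python
def merge_wordpieces(predictions):
    """Merge WordPiece continuation pieces ("##...") into whole words.

    Each merged word keeps the label predicted for its first piece.
    """
    words = []
    for pred in predictions:
        piece = pred["word"]
        if piece.startswith("##") and words:
            # Continuation piece: append its text to the previous word.
            words[-1]["word"] += piece[2:]
        else:
            # First piece of a new word: start a new entry with its label.
            words.append({"word": piece, "entity": pred["entity"]})
    return words


# Hypothetical token-level predictions (illustrative, not real model output).
sample = [
    {"word": "hu", "entity": "B-PER"},
    {"word": "##sey", "entity": "B-PER"},
    {"word": "##in", "entity": "I-PER"},
    {"word": "lives", "entity": "O"},
    {"word": "in", "entity": "O"},
    {"word": "ankara", "entity": "B-LOC"},
]

merged = merge_wordpieces(sample)
for item in merged:
    print(item["word"], item["entity"])
```

Keeping the first piece's label is a simple design choice; other rules, such as a majority vote over the pieces, are equally possible.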

## Training and evaluation data

- **Training Dataset**: CoNLL-2003 (`tner/conll2003`)

- **Training Evaluation Metrics**:

| Label   | Precision | Recall | F1-Score | Support |
|---------|-----------|--------|----------|---------|
| B-PER   | 0.98      | 0.98   | 0.98     | 11273   |
| I-PER   | 0.98      | 0.99   | 0.99     | 9323    |
| B-ORG   | 0.88      | 0.92   | 0.90     | 10447   |
| I-ORG   | 0.81      | 0.92   | 0.86     | 5137    |
| B-LOC   | 0.86      | 0.94   | 0.90     | 9621    |
| I-LOC   | 1.00      | 0.08   | 0.14     | 1267    |
| B-MISC  | 0.81      | 0.73   | 0.77     | 4793    |
| I-MISC  | 0.83      | 0.36   | 0.50     | 1329    |
| **Micro Avg** | **0.90** | **0.90** | **0.90** | **53190** |
| **Macro Avg** | **0.89** | **0.74** | **0.75** | **53190** |
| **Weighted Avg** | **0.90** | **0.90** | **0.89** | **53190** |

- **Validation Evaluation Metrics**:

| Label   | Precision | Recall | F1-Score | Support |
|---------|-----------|--------|----------|---------|
| B-PER   | 0.97      | 0.98   | 0.97     | 3018    |
| I-PER   | 0.98      | 0.98   | 0.98     | 2741    |
| B-ORG   | 0.86      | 0.91   | 0.88     | 2056    |
| I-ORG   | 0.77      | 0.81   | 0.79     | 900     |
| B-LOC   | 0.86      | 0.94   | 0.90     | 2618    |
| I-LOC   | 1.00      | 0.10   | 0.18     | 281     |
| B-MISC  | 0.77      | 0.74   | 0.76     | 1231    |
| I-MISC  | 0.77      | 0.34   | 0.48     | 390     |
| **Micro Avg** | **0.90** | **0.89** | **0.89** | **13235** |
| **Macro Avg** | **0.87** | **0.73** | **0.74** | **13235** |
| **Weighted Avg** | **0.90** | **0.89** | **0.88** | **13235** |

- **Test Evaluation Metrics**:

| Label   | Precision | Recall | F1-Score | Support |
|---------|-----------|--------|----------|---------|
| B-PER   | 0.96      | 0.95   | 0.96     | 2714    |
| I-PER   | 0.98      | 0.99   | 0.98     | 2487    |
| B-ORG   | 0.81      | 0.87   | 0.84     | 2588    |
| I-ORG   | 0.74      | 0.87   | 0.80     | 1050    |
| B-LOC   | 0.81      | 0.90   | 0.85     | 2121    |
| I-LOC   | 0.89      | 0.12   | 0.22     | 276     |
| B-MISC  | 0.75      | 0.67   | 0.71     | 996     |
| I-MISC  | 0.85      | 0.49   | 0.62     | 241     |
| **Micro Avg** | **0.87** | **0.88** | **0.87** | **12473** |
| **Macro Avg** | **0.85** | **0.73** | **0.75** | **12473** |
| **Weighted Avg** | **0.87** | **0.88** | **0.86** | **12473** |
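
The gap between the macro and weighted averages comes from how each is computed: the macro average counts every label equally, while the weighted average counts each label in proportion to its support. The sketch below recomputes both for recall from the rounded test-set rows above, so the results agree with the table only up to rounding. The variable names are illustrative.

```python
# (recall, support) per label, copied from the test-set table above.
test_rows = {
    "B-PER": (0.95, 2714),
    "I-PER": (0.99, 2487),
    "B-ORG": (0.87, 2588),
    "I-ORG": (0.87, 1050),
    "B-LOC": (0.90, 2121),
    "I-LOC": (0.12, 276),
    "B-MISC": (0.67, 996),
    "I-MISC": (0.49, 241),
}

total_support = sum(support for _, support in test_rows.values())

# Macro average: unweighted mean over labels, so rare labels count as much as common ones.
macro_recall = sum(recall for recall, _ in test_rows.values()) / len(test_rows)

# Weighted average: each label contributes in proportion to its support.
weighted_recall = (
    sum(recall * support for recall, support in test_rows.values()) / total_support
)

print(f"total support: {total_support}")
print(f"macro recall: {macro_recall:.2f}")
print(f"weighted recall: {weighted_recall:.2f}")
```

The low I-LOC and I-MISC recall pulls the macro average well below the weighted one, because those two labels have small support.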

## Training procedure

### Training Hyperparameters

- **Optimizer**: AdamWeightDecay
  - Learning Rate: 2e-05
  - Decay Schedule: PolynomialDecay
  - Warmup Steps: 0.1
  - Weight Decay Rate: 0.01
- **Training Precision**: float32

### Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 0.1016     | 0.0254          | 0     |
| 0.0228     | 0.0180          | 1     |

### Optimizer Details

```python
from transformers import create_optimizer

batch_size = 32
num_train_epochs = 2
# tokenized_conll is the tokenized CoNLL-2003 dataset from the training script (not defined here).
num_train_steps = (len(tokenized_conll["train"]) // batch_size) * num_train_epochs

# Returns the optimizer and its learning-rate schedule (TensorFlow).
optimizer, lr_schedule = create_optimizer(
    init_lr=2e-5,
    num_train_steps=num_train_steps,
    weight_decay_rate=0.01,
    num_warmup_steps=0.1,  # value used in training; this argument is a step count, not a fraction
)
```
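
`create_optimizer` takes `num_warmup_steps` as a count of optimizer steps. If a warmup over roughly the first 10% of training is wanted, that count can be derived from the total number of steps. The sketch below shows only this arithmetic and is not the configuration used to train this model. `num_train_examples` is an assumed value taken from the training-sentence count in the statistics table of this card, and `warmup_fraction` is a hypothetical name.

```python
batch_size = 32
num_train_epochs = 2

# Assumption: training-set size taken from the sentence count reported in this card.
# In practice this would be len(tokenized_conll["train"]).
num_train_examples = 14_987

# Same step formula as in the snippet above.
num_train_steps = (num_train_examples // batch_size) * num_train_epochs

# Hypothetical choice: warm up over the first 10% of steps.
warmup_fraction = 0.1
num_warmup_steps = int(warmup_fraction * num_train_steps)

print(num_train_steps, num_warmup_steps)
```

The resulting integer could then be passed as `num_warmup_steps`.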

## How to Use

### Using a Pipeline

```python
# Option 1: use the high-level pipeline helper.
from transformers import pipeline

pipe = pipeline("token-classification", model="huseyincenik/conll_ner_with_bert")

# Option 2: load the tokenizer and model directly.
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("huseyincenik/conll_ner_with_bert")
model = AutoModelForTokenClassification.from_pretrained("huseyincenik/conll_ner_with_bert")
```

The model predicts the following tags:

| Abbreviation | Description |
|--------------|-------------|
| O | Outside of a named entity |
| B-MISC | Beginning of a miscellaneous entity |
| I-MISC | Continuation of a miscellaneous entity |
| B-PER | Beginning of a person’s name |
| I-PER | Continuation of a person’s name |
| B-ORG | Beginning of an organization |
| I-ORG | Continuation of an organization |
| B-LOC | Beginning of a location |
| I-LOC | Continuation of a location |
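
To turn word-level tags into entity spans, consecutive `B-`/`I-` tags of the same type can be grouped. The following is a minimal sketch under the assumption that a `B-` tag, or an `I-` tag with no matching open entity, starts a new span. The function name `tags_to_spans` and the sample sentence are hypothetical.

```python
def tags_to_spans(words, tags):
    """Group word-level tags into (entity_type, text) spans."""
    spans = []
    current_type = None
    current_words = []

    for word, tag in zip(words, tags):
        prefix, _, entity_type = tag.partition("-")
        continues = prefix == "I" and entity_type == current_type
        if not continues:
            # Close the open span (if any) before handling the current tag.
            if current_type is not None:
                spans.append((current_type, " ".join(current_words)))
            current_type, current_words = None, []
        if prefix in ("B", "I"):
            # A B- tag, or an I- tag with no matching open span, starts a span;
            # a matching I- tag extends the open one.
            current_type = entity_type
            current_words.append(word)

    # Flush the last span if the sentence ends inside an entity.
    if current_type is not None:
        spans.append((current_type, " ".join(current_words)))
    return spans


# Hypothetical example sentence with hand-written tags.
words = ["Angela", "Merkel", "visited", "New", "York", "City"]
tags = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "I-LOC"]
spans = tags_to_spans(words, tags)
print(spans)
```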

### CoNLL-2003 English Dataset Statistics

This dataset was derived from the Reuters corpus, which consists of Reuters news stories. You can read more about how this dataset was created in the CoNLL-2003 paper.

#### # of training examples per entity type

| Dataset | LOC | MISC | ORG | PER |
|---------|-----|------|-----|-----|
| Train | 7140 | 3438 | 6321 | 6600 |
| Dev | 1837 | 922 | 1341 | 1842 |
| Test | 1668 | 702 | 1661 | 1617 |

#### # of articles/sentences/tokens per dataset

| Dataset | Articles | Sentences | Tokens |
|---------|----------|-----------|--------|
| Train | 946 | 14,987 | 203,621 |
| Dev | 216 | 3,466 | 51,362 |
| Test | 231 | 3,684 | 46,435 |

### Framework versions

- Transformers 4.45.0.dev0
- TensorFlow 2.17.0
- Datasets 2.21.0
- Tokenizers 0.19.1