---
library_name: transformers
license: apache-2.0
base_model: bert-base-uncased
tags:
- generated_from_keras_callback
model-index:
- name: huseyincenik/conll_ner_with_bert
  results: []
datasets:
- tner/conll2003
language:
- en
metrics:
- accuracy
pipeline_tag: token-classification
---

# huseyincenik/conll_ner_with_bert

This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the CoNLL-2003 dataset for Named Entity Recognition (NER).

## Model description

This model has been trained to perform Named Entity Recognition (NER) and is based on the BERT architecture. It was fine-tuned on the CoNLL-2003 dataset, a standard dataset for NER tasks.

## Intended uses & limitations

### Intended Uses

- **Named Entity Recognition**: This model is designed to identify and classify named entities in text into categories such as location (LOC), organization (ORG), person (PER), and miscellaneous (MISC).

### Limitations

- **Domain Specificity**: The model was fine-tuned on the CoNLL-2003 dataset, which consists of news articles. It may not generalize well to other domains or types of text not represented in the training data.
- **Subword Tokens**: The model may occasionally tag subword tokens as entities, requiring post-processing to handle these cases.

## Training and evaluation data

- **Training Dataset**: CoNLL-2003
- **Training Evaluation Metrics**:

| Label | Precision | Recall | F1-Score | Support |
|---------|-----------|--------|----------|---------|
| B-PER | 0.98 | 0.98 | 0.98 | 11273 |
| I-PER | 0.98 | 0.99 | 0.99 | 9323 |
| B-ORG | 0.88 | 0.92 | 0.90 | 10447 |
| I-ORG | 0.81 | 0.92 | 0.86 | 5137 |
| B-LOC | 0.86 | 0.94 | 0.90 | 9621 |
| I-LOC | 1.00 | 0.08 | 0.14 | 1267 |
| B-MISC | 0.81 | 0.73 | 0.77 | 4793 |
| I-MISC | 0.83 | 0.36 | 0.50 | 1329 |
| **Micro Avg** | **0.90** | **0.90** | **0.90** | **53190** |
| **Macro Avg** | **0.89** | **0.74** | **0.75** | **53190** |
| **Weighted Avg** | **0.90** | **0.90** | **0.89** | **53190** |

- **Validation Evaluation Metrics**:

| Label | Precision | Recall | F1-Score | Support |
|---------|-----------|--------|----------|---------|
| B-PER | 0.97 | 0.98 | 0.97 | 3018 |
| I-PER | 0.98 | 0.98 | 0.98 | 2741 |
| B-ORG | 0.86 | 0.91 | 0.88 | 2056 |
| I-ORG | 0.77 | 0.81 | 0.79 | 900 |
| B-LOC | 0.86 | 0.94 | 0.90 | 2618 |
| I-LOC | 1.00 | 0.10 | 0.18 | 281 |
| B-MISC | 0.77 | 0.74 | 0.76 | 1231 |
| I-MISC | 0.77 | 0.34 | 0.48 | 390 |
| **Micro Avg** | **0.90** | **0.89** | **0.89** | **13235** |
| **Macro Avg** | **0.87** | **0.73** | **0.74** | **13235** |
| **Weighted Avg** | **0.90** | **0.89** | **0.88** | **13235** |

- **Test Evaluation Metrics**:

| Label | Precision | Recall | F1-Score | Support |
|---------|-----------|--------|----------|---------|
| B-PER | 0.96 | 0.95 | 0.96 | 2714 |
| I-PER | 0.98 | 0.99 | 0.98 | 2487 |
| B-ORG | 0.81 | 0.87 | 0.84 | 2588 |
| I-ORG | 0.74 | 0.87 | 0.80 | 1050 |
| B-LOC | 0.81 | 0.90 | 0.85 | 2121 |
| I-LOC | 0.89 | 0.12 | 0.22 | 276 |
| B-MISC | 0.75 | 0.67 | 0.71 | 996 |
| I-MISC | 0.85 | 0.49 | 0.62 | 241 |
| **Micro Avg** | **0.87** | **0.88** | **0.87** | **12473** |
| **Macro Avg** | **0.85** | **0.73** | **0.75** | **12473** |
| **Weighted Avg** | **0.87** | **0.88** | **0.86** | **12473** |

## Training procedure

### Training Hyperparameters

- **Optimizer**: AdamWeightDecay
  - Learning Rate: 2e-05
  - Decay Schedule: PolynomialDecay
  - Warmup Steps: 0.1
  - Weight Decay Rate: 0.01
- **Training Precision**: float32
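For context, here is a minimal sketch of how these hyperparameters could be wired into a Keras fine-tuning run. It is an illustration rather than the exact training script: `tokenized_conll` is assumed to be a tokenized, label-aligned copy of CoNLL-2003 (the same name used in the "Optimizer Details" block below), and the data collator and `prepare_tf_dataset` calls are one common way to build the `tf.data` pipelines.

```python
from transformers import (
    AutoTokenizer,
    DataCollatorForTokenClassification,
    TFAutoModelForTokenClassification,
    create_optimizer,
)

# `tokenized_conll` is assumed to be the tokenized, label-aligned DatasetDict
# described above; it is not defined here.
model_checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = TFAutoModelForTokenClassification.from_pretrained(
    model_checkpoint,
    num_labels=9,  # O, B-PER, I-PER, B-ORG, I-ORG, B-LOC, I-LOC, B-MISC, I-MISC
)

# Same optimizer settings as in "Optimizer Details" below.
batch_size = 32
num_train_epochs = 2
num_train_steps = (len(tokenized_conll["train"]) // batch_size) * num_train_epochs
optimizer, lr_schedule = create_optimizer(
    init_lr=2e-5,
    num_train_steps=num_train_steps,
    weight_decay_rate=0.01,
    num_warmup_steps=0.1,
)

# Pad and batch the tokenized splits into tf.data pipelines.
data_collator = DataCollatorForTokenClassification(tokenizer, return_tensors="np")
train_set = model.prepare_tf_dataset(
    tokenized_conll["train"], batch_size=batch_size, shuffle=True, collate_fn=data_collator
)
validation_set = model.prepare_tf_dataset(
    tokenized_conll["validation"], batch_size=batch_size, shuffle=False, collate_fn=data_collator
)

# No explicit loss: the model falls back to its internal token-classification loss.
model.compile(optimizer=optimizer)
model.fit(train_set, validation_data=validation_set, epochs=num_train_epochs)
```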
### Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 0.1016 | 0.0254 | 0 |
| 0.0228 | 0.0180 | 1 |

### Optimizer Details

```python
from transformers import create_optimizer

batch_size = 32
num_train_epochs = 2

# Total number of optimization steps over the tokenized training split.
num_train_steps = (len(tokenized_conll["train"]) // batch_size) * num_train_epochs

optimizer, lr_schedule = create_optimizer(
    init_lr=2e-5,
    num_train_steps=num_train_steps,
    weight_decay_rate=0.01,
    num_warmup_steps=0.1
)
```

## How to Use

### Using a Pipeline

```python
from transformers import pipeline

pipe = pipeline("token-classification", model="huseyincenik/conll_ner_with_bert")
```

Or load the model and tokenizer directly:

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("huseyincenik/conll_ner_with_bert")
model = AutoModelForTokenClassification.from_pretrained("huseyincenik/conll_ner_with_bert")
```

An example that merges subword pieces into whole entities is shown at the end of this card.

The model predicts the following labels:

Abbreviation|Description
-|-
O|Outside of a named entity
B-MISC|Beginning of a miscellaneous entity right after another miscellaneous entity
I-MISC|Miscellaneous entity
B-PER|Beginning of a person’s name right after another person’s name
I-PER|Person’s name
B-ORG|Beginning of an organization right after another organization
I-ORG|Organization
B-LOC|Beginning of a location right after another location
I-LOC|Location

### CoNLL-2003 English Dataset Statistics

This dataset was derived from the Reuters corpus, which consists of Reuters news stories. You can read more about how this dataset was created in the CoNLL-2003 paper.

#### # of training examples per entity type

Dataset|LOC|MISC|ORG|PER
-|-|-|-|-
Train|7140|3438|6321|6600
Dev|1837|922|1341|1842
Test|1668|702|1661|1617

#### # of articles/sentences/tokens per dataset

Dataset|Articles|Sentences|Tokens
-|-|-|-
Train|946|14,987|203,621
Dev|216|3,466|51,362
Test|231|3,684|46,435

### Framework versions

- Transformers 4.45.0.dev0
- TensorFlow 2.17.0
- Datasets 2.21.0
- Tokenizers 0.19.1
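### Handling Subword Tokens

As noted under "Intended uses & limitations", the raw token-level output may include WordPiece subword pieces tagged as separate entities. Below is a minimal sketch of one way to merge them using the pipeline's built-in `aggregation_strategy` option; the example sentence is illustrative only.

```python
from transformers import pipeline

# "simple" groups consecutive tokens that share an entity label into one span,
# so subword pieces are merged back into whole words.
ner = pipeline(
    "token-classification",
    model="huseyincenik/conll_ner_with_bert",
    aggregation_strategy="simple",
)

for entity in ner("Wolfgang Schmidt works for the United Nations in Geneva."):
    # Each item has entity_group, score, word, start, and end.
    print(entity["entity_group"], entity["word"], round(float(entity["score"]), 3))
```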