
bert-base-cased-finetuned-conll2003-ner-v2

BERT ("bert-base-cased") finetuned on CoNLL-2003 (Conference on Computational Natural Language Learning).

The model performs named entity recognition (NER). It accompanies Section 2 of Chapter 7 of the Hugging Face "NLP Course" (https://huggingface.co/learn/nlp-course/chapter7/2).

It was trained using a custom PyTorch loop with Hugging Face Accelerate.

Code: https://github.com/sambitmukherjee/huggingface-notebooks/blob/main/course/en/chapter7/section2_pt.ipynb

Experiment tracking: https://wandb.ai/sadhaklal/bert-base-cased-finetuned-conll2003-ner-v2
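The full loop, hyperparameters, and data pipeline are in the notebook linked above. The sketch below only illustrates the Accelerate pattern such a loop follows; the dummy batches, vocabulary size, label count, and learning rate here are placeholders, not the values used for this model.

import torch
from torch.optim import AdamW
from accelerate import Accelerator
from transformers import AutoModelForTokenClassification

# Dummy batches standing in for the tokenized CoNLL-2003 dataloader used in the notebook.
dummy_batches = [
    {
        "input_ids": torch.randint(0, 28996, (2, 16)),       # bert-base-cased vocab size
        "attention_mask": torch.ones(2, 16, dtype=torch.long),
        "labels": torch.randint(0, 9, (2, 16)),               # 9 CoNLL-2003 NER labels
    }
]

model = AutoModelForTokenClassification.from_pretrained("bert-base-cased", num_labels=9)
optimizer = AdamW(model.parameters(), lr=2e-5)

# Accelerate handles device placement and (if configured) distributed training.
accelerator = Accelerator()
model, optimizer = accelerator.prepare(model, optimizer)

model.train()
for batch in dummy_batches:
    batch = {k: v.to(accelerator.device) for k, v in batch.items()}
    outputs = model(**batch)
    accelerator.backward(outputs.loss)
    optimizer.step()
    optimizer.zero_grad()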

Usage

from transformers import pipeline

model_checkpoint = "sadhaklal/bert-base-cased-finetuned-conll2003-ner-v2"
token_classifier = pipeline("token-classification", model=model_checkpoint, aggregation_strategy="simple")

print(token_classifier("My name is Sylvain and I work at Hugging Face in Brooklyn."))

Dataset

From the dataset page:

The shared task of CoNLL-2003 concerns language-independent named entity recognition. We will concentrate on four types of named entities: persons, locations, organizations and names of miscellaneous entities that do not belong to the previous three groups.

Examples: https://huggingface.co/datasets/conll2003/viewer
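To browse the data programmatically, a minimal sketch using the datasets library (assuming the "conll2003" dataset id on the Hub) is:

from datasets import load_dataset

raw_datasets = load_dataset("conll2003")
example = raw_datasets["train"][0]

# Each example is a list of tokens with aligned NER tag ids.
print(example["tokens"])
print(example["ner_tags"])

# The tag names (O, B-PER, I-PER, B-ORG, ...) come from the dataset features.
label_names = raw_datasets["train"].features["ner_tags"].feature.names
print(label_names)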

Metrics

On the 'validation' split of CoNLL-2003:

Accuracy: 0.9858
Precision: 0.9243
Recall: 0.947
F1: 0.9355
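Precision, recall, and F1 of this kind are typically entity-level metrics computed with seqeval, as in the course. A minimal sketch of that computation with the evaluate library (the per-sentence label lists below are made-up examples, not model outputs) is:

import evaluate

metric = evaluate.load("seqeval")

# Per-sentence lists of label strings (O, B-PER, I-PER, ...), with
# special tokens / ignored positions already filtered out.
predictions = [["O", "B-PER", "I-PER", "O"], ["B-LOC", "O"]]
references = [["O", "B-PER", "I-PER", "O"], ["B-LOC", "O"]]

results = metric.compute(predictions=predictions, references=references)
print(results["overall_precision"], results["overall_recall"],
      results["overall_f1"], results["overall_accuracy"])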
