DaFull committed on
Commit 1d0d1a5
1 Parent(s): e8e5310

Update README.md

Files changed (1)
  1. README.md +60 -44
README.md CHANGED
@@ -2,11 +2,12 @@
  tags:
  - spacy
  - token-classification
  language:
  - en
  license: mit
  model-index:
- - name: en_core_web_sm_job
  results:
  - task:
  name: NER
@@ -14,55 +15,79 @@ model-index:
  metrics:
  - name: NER Precision
  type: precision
- value: 0.8454836771
  - name: NER Recall
  type: recall
- value: 0.8456530449
  - name: NER F Score
  type: f_score
- value: 0.8455683525
  - task:
  name: TAG
  type: token-classification
  metrics:
  - name: TAG (XPOS) Accuracy
  type: accuracy
- value: 0.97246532
- - task:
- name: UNLABELED_DEPENDENCIES
- type: token-classification
- metrics:
- - name: Unlabeled Attachment Score (UAS)
- type: f_score
- value: 0.9175304332
- - task:
- name: LABELED_DEPENDENCIES
- type: token-classification
- metrics:
- - name: Labeled Attachment Score (LAS)
- type: f_score
- value: 0.89874821
- - task:
- name: SENTS
- type: token-classification
- metrics:
- - name: Sentences F-Score
- type: f_score
- value: 0.9059485531
  ---
- English pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter, ner, attribute_ruler, lemmatizer.

  | Feature | Description |
  | --- | --- |
- | **Name** | `en_core_web_sm_job` |
  | **Version** | `3.6.0` |
  | **spaCy** | `>=3.6.0,<3.7.0` |
  | **Default Pipeline** | `tok2vec`, `tagger`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
  | **Components** | `tok2vec`, `tagger`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
- | **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
- | **Sources** | [OntoNotes 5](https://catalog.ldc.upenn.edu/LDC2013T19) (Ralph Weischedel, Martha Palmer, Mitchell Marcus, Eduard Hovy, Sameer Pradhan, Lance Ramshaw, Nianwen Xue, Ann Taylor, Jeff Kaufman, Michelle Franchini, Mohammed El-Bachouti, Robert Belvin, Ann Houston)<br>[ClearNLP Constituent-to-Dependency Conversion](https://github.com/clir/clearnlp-guidelines/blob/master/md/components/dependency_conversion.md) (Emory University)<br>[WordNet 3.0](https://wordnet.princeton.edu/) (Princeton University) |
  | **License** | `MIT` |
- | **Author** | [Explosion](https://explosion.ai) |

  ### Label Scheme

@@ -82,16 +107,7 @@ English pipeline optimized for CPU. Components: tok2vec, tagger, parser, senter,

  | Type | Score |
  | --- | --- |
- | `TOKEN_ACC` | 99.86 |
- | `TOKEN_P` | 99.57 |
- | `TOKEN_R` | 99.58 |
- | `TOKEN_F` | 99.57 |
- | `TAG_ACC` | 97.25 |
- | `SENTS_P` | 92.02 |
- | `SENTS_R` | 89.21 |
- | `SENTS_F` | 90.59 |
- | `DEP_UAS` | 91.75 |
- | `DEP_LAS` | 89.87 |
- | `ENTS_P` | 84.55 |
- | `ENTS_R` | 84.57 |
- | `ENTS_F` | 84.56 |
 
  tags:
  - spacy
  - token-classification
+ - ner
  language:
  - en
  license: mit
  model-index:
+ - name: en_core_web_sm_job
  results:
  - task:
  name: NER
 
  metrics:
  - name: NER Precision
  type: precision
+ value: 0.7516398746
  - name: NER Recall
  type: recall
+ value: 0.6069711538
  - name: NER F Score
  type: f_score
+ value: 0.6742971968
  - task:
  name: TAG
  type: token-classification
  metrics:
  - name: TAG (XPOS) Accuracy
  type: accuracy
+ value: 0.7334810915
+
+ library_name: spacy
+ pipeline_tag: text-classification
  ---
+ # Custom spaCy NER Model for "Profession," "Facility," and "Experience" Entities
+
+ ### Overview
+ This spaCy-based Named Entity Recognition (NER) model is custom-trained to recognize and classify "profession," "facility," and "experience" entities in unstructured text.
+
+ ### Key Features
+ - Custom-trained to recognize "profession," "facility," and "experience" entities (reported NER precision 0.75, recall 0.61, F-score 0.67).
+ - Suitable for NLP tasks such as information extraction and content categorization.
+ - Integrates into existing spaCy-based NLP pipelines.
+
+ ### Usage
+ #### Installation
+ You can install the custom spaCy NER model using pip:
+
+ ```bash
+ pip install https://huggingface.co/DaFull/en_core_web_sm_job/resolve/main/en_core_web_sm_job-any-py3-none-any.whl
+ ```
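After installing, it can help to confirm that the package is importable before calling `spacy.load`. The following is a minimal sketch using only the standard library. It assumes the wheel installs a top-level Python package named `en_core_web_sm_job` (the usual convention for packaged spaCy pipelines, but not confirmed by this card), and the helper name `is_installed` is hypothetical.

```python
from importlib.util import find_spec


def is_installed(package_name: str) -> bool:
    """Return True if a top-level package with this name can be found."""
    return find_spec(package_name) is not None


if __name__ == "__main__":
    # Assumption: the wheel installs a package called "en_core_web_sm_job".
    name = "en_core_web_sm_job"
    if is_installed(name):
        print(f"{name} is installed")
    else:
        print(f"{name} is not installed; run the pip command first")
```

The check only looks the package up and does not import or load the pipeline, so it stays cheap even for large models.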
+ #### Example Usage
+ Here's how you can use the model for entity recognition in Python:
+
+ ```python
+ import spacy
+
+ # Load the custom spaCy NER model
+ nlp = spacy.load("en_core_web_sm_job")
+
+ # Process your text
+ text = "HR Specialist needed at Google, Dallas, TX, with expertise in employee relations and a minimum of 4 years of HR experience."
+ doc = nlp(text)
+
+ # Extract named entities
+ for ent in doc.ents:
+     print(f"Entity: {ent.text}, Type: {ent.label_}")
+ ```
+
+ #### Entity Types
+ The model recognizes the following entity types:
+
+ - PROFESSION: Represents professions or job titles.
+ - FACILITY: Denotes facilities, buildings, or locations.
+ - EXPERIENCE: Identifies mentions of work experience, durations, or qualifications.
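Downstream code often wants the entities bucketed by these three labels. The sketch below groups `(text, label)` pairs into a dictionary. The helper name `group_by_label` is hypothetical, and the sample pairs are made up for illustration; they are not actual output from this model.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_label(entities: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (text, label) pairs into a dict mapping label -> list of texts."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for text, label in entities:
        grouped[label].append(text)
    return dict(grouped)


# Hypothetical pairs for illustration only (not actual model output).
# With a loaded pipeline you would build them as:
#   pairs = [(ent.text, ent.label_) for ent in doc.ents]
pairs = [
    ("HR Specialist", "PROFESSION"),
    ("4 years of HR experience", "EXPERIENCE"),
]
print(group_by_label(pairs))
```

Keeping the helper independent of spaCy objects makes it easy to unit-test without loading the pipeline.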

  | Feature | Description |
  | --- | --- |
+ | **Name** | `en_core_web_sm_job` |
  | **Version** | `3.6.0` |
  | **spaCy** | `>=3.6.0,<3.7.0` |
  | **Default Pipeline** | `tok2vec`, `tagger`, `parser`, `attribute_ruler`, `lemmatizer`, `ner` |
  | **Components** | `tok2vec`, `tagger`, `parser`, `senter`, `attribute_ruler`, `lemmatizer`, `ner` |
+ | **Vectors** | 514157 keys, 514157 unique vectors (300 dimensions) |
  | **License** | `MIT` |
+

  ### Label Scheme

 

  | Type | Score |
  | --- | --- |
+ | `TOKEN_P` | 75.57 |
+ | `TOKEN_R` | 60.58 |
+ | `TOKEN_F` | 67.57 |
+ | `CUSTOM_TAG_ACC` | 73.35 |
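An F-score is conventionally the harmonic mean of precision and recall, so the reported numbers can be cross-checked with a few lines of arithmetic. The sketch below applies that formula to the NER precision and recall from the `model-index` metadata of this card. The function name `f_score` is hypothetical, and the card does not state how its F-score was computed, so a difference between the computed and reported values may reflect a different averaging method or evaluation split.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1); 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the model-index metadata of this card.
precision = 0.7516398746
recall = 0.6069711538
reported_f = 0.6742971968

computed_f = f_score(precision, recall)
print(f"computed F1: {computed_f:.4f}, reported: {reported_f:.4f}")
```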