pronics2004 committed
Commit 5069525
1 Parent(s): 6a5ae53

Update README.md

Files changed (1)
  1. README.md +39 -3
README.md CHANGED
@@ -1,3 +1,39 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ language:
+ - en
+ pipeline_tag: text-classification
+ ---
+
+ ## Model Description
+ This model is IBM's 12-layer toxicity binary classifier for English, intended to be used as a guardrail for any large language model. It has been trained on several benchmark datasets in English, specifically for detecting hateful, abusive, profane, and other toxic content in plain text.
+
+
+ ## Model Usage
+ ```python
+ # Example of how to use the model
+ import torch
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer
+ device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
+
+ model_name_or_path = 'ibm-granite/granite-guardian-hap-125m'
+ model = AutoModelForSequenceClassification.from_pretrained(model_name_or_path)
+ tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
+ model.to(device)
+
+ # Sample text
+ text = ["This is the 1st test", "This is the 2nd test"]
+ input = tokenizer(text, padding=True, truncation=True, return_tensors="pt").to(device)
+
+ with torch.no_grad():
+     logits = model(**input).logits
+     prediction = torch.argmax(logits, dim=1).cpu().detach().numpy().tolist() # Binary prediction where label 1 indicates toxicity.
+     probability = torch.softmax(logits, dim=1).cpu().detach().numpy()[:,1].tolist() # Probability of toxicity.
+
+ ```
+
+ ## Performance Comparison with Other Models
+ This model demonstrates superior average performance compared with other models across eight mainstream toxicity benchmarks. If a very fast model is required, please refer to the lightweight 4-layer IBM model, granite-guardian-hap-38m.
+
+ ![Performance comparison with other toxicity models (chart A)](125m_comparison_a.png)
+ ![Performance comparison with other toxicity models (chart B)](125m_comparison_b.png)
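
The usage snippet in the README returns raw predictions and per-text toxicity probabilities. Below is a minimal illustrative sketch, not part of the commit above, of how that output could be turned into a guardrail decision by thresholding the toxicity probability; the helper name `is_toxic` and the 0.5 cutoff are assumptions for illustration, not part of the model card.

```python
# Illustrative sketch only: wraps the model-card snippet in a simple guardrail check.
# The helper name `is_toxic` and the 0.5 threshold are assumptions, not from the model card.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

model_name_or_path = 'ibm-granite/granite-guardian-hap-125m'
model = AutoModelForSequenceClassification.from_pretrained(model_name_or_path).to(device)
tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)
model.eval()

def is_toxic(texts, threshold=0.5):
    # Tokenize, score, and flag each text whose toxicity probability (label 1) meets the threshold.
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt").to(device)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = torch.softmax(logits, dim=1)[:, 1].cpu().tolist()
    return [(t, p, p >= threshold) for t, p in zip(texts, probs)]

for text, prob, flagged in is_toxic(["This is the 1st test", "This is the 2nd test"]):
    print(f"toxic={flagged} p={prob:.3f} text={text!r}")
```

In a guardrail setting, flagged inputs or model outputs would typically be blocked or rerouted before reaching the user; the appropriate threshold depends on the application's tolerance for false positives.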