Cyrile committed on
Commit
0f6512c
1 Parent(s): b79a0cd

Update README.md

Files changed (1)
  1. README.md +8 -6
README.md CHANGED
@@ -28,12 +28,14 @@ Benchmark
 
  As the scores range from 0 to 1, a performance measure such as MAE or RMSE may be challenging to interpret. Therefore, Pearson's inter-correlation was chosen as a measure. Pearson's inter-correlation is a measure ranging from -1 to 1, where 0 represents no correlation, -1 represents perfect negative correlation, and 1 represents perfect positive correlation. The goal is to quantitatively measure the correlation between the model's scores and the scores assigned by judges for 750 comments not seen during training.
 
- | Model | Language | Obsecene (x100) | Sexual explicit (x100) | Identity attack (x100) | Insult (x100) | Threat (x100) |
- |-------------------------------------------------------------------------------|----------|:-----------------------:|-------------------------------|-------------------------------|----------------------|----------------------|
- | [Bloomz-560m-guardrail](https://huggingface.co/cmarkea/bloomz-560m-guardrail) | French | 62 | 73 | 73 | 68 | 61 |
- | [Bloomz-560m-guardrail](https://huggingface.co/cmarkea/bloomz-560m-guardrail) | English | 63 | 61 | 63 | 67 | 55 |
- | [Bloomz-3b-guardrail](https://huggingface.co/cmarkea/bloomz-3b-guardrail) | Frnech | 72 | 82 | 80 | 78 | 77 |
- | [Bloomz-3b-guardrail](https://huggingface.co/cmarkea/bloomz-3b-guardrail) | English | 76 | 78 | 77 | 75
+ | Model | Language | Obsecene (x100) | Sexual explicit (x100) | Identity attack (x100) | Insult (x100) | Threat (x100) | Mean |
+ |-------------------------------------------------------------------------------|----------|:-----------------------:|-------------------------------|-------------------------------|----------------------|----------------------|------|
+ | [Bloomz-560m-guardrail](https://huggingface.co/cmarkea/bloomz-560m-guardrail) | French | 62 | 73 | 73 | 68 | 61 | 67 |
+ | [Bloomz-560m-guardrail](https://huggingface.co/cmarkea/bloomz-560m-guardrail) | English | 63 | 61 | 63 | 67 | 55 | 62 |
+ | [Bloomz-3b-guardrail](https://huggingface.co/cmarkea/bloomz-3b-guardrail) | Frnech | 72 | 82 | 80 | 78 | 77 | 78 |
+ | [Bloomz-3b-guardrail](https://huggingface.co/cmarkea/bloomz-3b-guardrail) | English | 76 | 78 | 77 | 75 | 79 | 77 |
+
+ With a correlation of approximately 60 for the 560m model and approximately 80 for the 3b model, the output is highly correlated with the judges' scores.
 
  Citation
  --------
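
Reviewer note: the README paragraph in this hunk describes scoring the model with Pearson's correlation (reported x100), and the commit adds a `Mean` column. The sketch below is a minimal illustration of both, not the author's evaluation code. The function name `pearson` and the toy `model_scores` / `judge_scores` lists are hypothetical; only the per-category numbers in `rows` are taken from the table above, and the assumption that `Mean` is the rounded arithmetic mean of the five category columns is mine (it happens to reproduce the reported values).

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance numerator and the two standard-deviation numerators.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (dev_x * dev_y)


# Hypothetical toy data: model scores vs. judge scores in [0, 1].
model_scores = [0.10, 0.40, 0.35, 0.80, 0.95]
judge_scores = [0.05, 0.30, 0.50, 0.70, 0.90]
print("correlation x100:", round(100 * pearson(model_scores, judge_scores)))

# Per-category correlations (x100) copied from the added table rows.
rows = {
    ("bloomz-560m-guardrail", "French"): [62, 73, 73, 68, 61],
    ("bloomz-560m-guardrail", "English"): [63, 61, 63, 67, 55],
    ("bloomz-3b-guardrail", "French"): [72, 82, 80, 78, 77],
    ("bloomz-3b-guardrail", "English"): [76, 78, 77, 75, 79],
}

# Assumed definition of the Mean column: rounded arithmetic mean of the row.
means = {key: round(sum(vals) / len(vals)) for key, vals in rows.items()}
for key, mean in means.items():
    print(key, mean)
```

Under that assumption the computed means are 67, 62, 78 and 77, matching the `Mean` column added in this commit.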