Cyrile committed on
Commit d750423
1 Parent(s): 6f50991

Update README.md

  1. README.md +1 -1
README.md CHANGED
@@ -23,7 +23,7 @@ This kind of modeling can be ideal for monitoring and controlling the output of
  Training
  --------

- The training dataset consists of 500k examples of comments in English and 500k comments in French (translated by Google Translate), each annotated with a toxicity severity graduation. The dataset used is provided by [Jigsaw](https://jigsaw.google.com/) as part of a Kaggle competition : [Jigsaw Unintended Bias in Toxicity Classification](https://www.kaggle.com/competitions/jigsaw-unintended-bias-in-toxicity-classification/data). Since the scores represent severity graduations, regression was preferred using the following loss function:
+ The training dataset consists of 500k examples of comments in English and 500k comments in French (translated by Google Translate), each annotated with a toxicity severity graduation. The dataset used is provided by [Jigsaw](https://jigsaw.google.com/approach/) as part of a Kaggle competition : [Jigsaw Unintended Bias in Toxicity Classification](https://www.kaggle.com/competitions/jigsaw-unintended-bias-in-toxicity-classification/data). Since the scores represent severity graduations, regression was preferred using the following loss function:
  $$loss=l_{\mathrm{obscene}}+l_{\mathrm{sexual\_explicit}}+l_{\mathrm{identity\_attack}}+l_{\mathrm{insult}}+l_{\mathrm{threat}}$$
  with
  $$l_i=\frac{1}{\vert\mathcal{O}\vert}\sum_{o\in\mathcal{O}}\vert\mathrm{score}_{i,o}-\sigma(\mathrm{logit}_{i,o})\vert$$
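
For readers of this diff, the loss in the README context lines can be sketched in plain Python. This is only an illustration, not the training code from the repository: it assumes that $\mathcal{O}$ is the set of examples in a batch (the README does not define it) and that $\sigma$ is the logistic sigmoid. The names `LABELS`, `sigmoid`, `label_loss`, and `total_loss` are hypothetical.

```python
import math

# The five labels that appear in the README's loss formula.
LABELS = ["obscene", "sexual_explicit", "identity_attack", "insult", "threat"]


def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def label_loss(scores, logits):
    """l_i: mean absolute error between target scores and sigmoid(logits).

    Assumption: `scores` and `logits` hold one value per example in the
    batch, which plays the role of the set O in the formula.
    """
    if not scores or len(scores) != len(logits):
        raise ValueError("scores and logits must be non-empty and equal in length")
    return sum(abs(s - sigmoid(z)) for s, z in zip(scores, logits)) / len(scores)


def total_loss(scores_by_label, logits_by_label):
    """loss: sum of the per-label losses over the five labels."""
    return sum(
        label_loss(scores_by_label[name], logits_by_label[name])
        for name in LABELS
    )
```

With a two-example batch where every logit is 0 (so every prediction is 0.5) and the targets are 1.0 and 0.0, each label contributes 0.5 and the total is 2.5:

```python
scores = {name: [1.0, 0.0] for name in LABELS}
logits = {name: [0.0, 0.0] for name in LABELS}
print(total_loss(scores, logits))
```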