Edit model card

Model Description

We present a large classification model trained on a manually curated real-world dataset that can be used as a new benchmark for advancing research in toxicity detection and classification. We started with the original weights from the WavLM base plus and fine-tuned it with 2,374 hours of voice chat audio clips for multilabel classification. The audio clips are automatically labeled using a synthetic data pipeline described in our blog post. A single output can have multiple labels. The model outputs a n by 6 output tensor where the inferred labels are Profanity, DatingAndSexting, Racist, Bullying, Other, NoViolation. Other consists of policy violation categories with low prevalence such as drugs and alcohol or self-harm that are combined into a single category.

We evaluated this model on a data set with human annotated labels that contained a total of 9,795 samples with the class distribution shown below. Note that we did not include the "other" category in this evaluation data set.

Class Number of examples Duration (hours) % of dataset
Profanity 4893 15.38 49.95%
DatingAndSexting 688 2.52 7.02%
Racist 889 3.10 9.08%
Bullying 1256 4.25 12.82%
NoViolation 4185 9.93 42.73%

If we set the same threshold across all classes and treat the model as a binary classifier across all 4 toxicity classes (Profanity, DatingAndSexting, Racist, Bullying), we get a binarized average precision of 94.48%. The precision recall curve is as shown below.

PR Curve

Usage

The dependencies for the inference file can be installed as follows:

pip install -r requirements.txt

The inference file contains useful helper functions to preprocess the audio file for proper inference. To run the inference file, please run the following command:

python inference.py --audio_file <your audio file path> --model_path <path to Huggingface model>

You can get the model weights either by downloading from the model releases page here, or from HuggingFace under roblox/voice-safety-classifier. If model_path isn’t specified, the model will be loaded directly from HuggingFace.

Downloads last month
596
Safetensors
Model size
94.6M params
Tensor type
F32
·
Inference API
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.