mT0-XL-detox-orpo

Model Information

This is a multilingual 3.7B-parameter text detoxification model for 9 languages, built for the TextDetox 2024 shared task and based on mT0-XL. The model was trained in two steps: first, full fine-tuning on several parallel text detoxification datasets; second, ORPO alignment on a self-annotated preference dataset collected using toxicity and similarity classifiers. See the paper for more details.
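As a rough illustration of how a preference dataset could be assembled from toxicity and similarity classifiers, the sketch below ranks candidate rewrites and keeps the best one as "chosen" and the worst as "rejected". This is not the authors' pipeline: the scoring rule (non-toxicity times similarity), the `PreferencePair` type, and the `toxicity_fn` / `similarity_fn` callables are all assumptions made for the example; the paper describes the actual procedure.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class PreferencePair:
    """One preference example: a prompt with a preferred and a dispreferred output."""
    prompt: str
    chosen: str
    rejected: str


def build_preference_pair(
    source: str,
    candidates: List[str],
    toxicity_fn: Callable[[str], float],         # hypothetical: returns toxicity in [0, 1]
    similarity_fn: Callable[[str, str], float],  # hypothetical: returns similarity in [0, 1]
) -> Optional[PreferencePair]:
    """Rank candidates by (1 - toxicity) * similarity and pair the best with the worst.

    The combined score is an assumption for illustration, not the paper's rule.
    Returns None when no meaningful preference can be formed.
    """
    if len(candidates) < 2:
        return None

    def score(candidate: str) -> float:
        return (1.0 - toxicity_fn(candidate)) * similarity_fn(source, candidate)

    ranked = sorted(candidates, key=score, reverse=True)
    best, worst = ranked[0], ranked[-1]

    # Skip sources whose candidates are all scored the same: there is no preference signal.
    if score(best) == score(worst):
        return None

    return PreferencePair(prompt=source, chosen=best, rejected=worst)
```

In practice, `toxicity_fn` and `similarity_fn` would wrap real classifiers; any callables with the signatures above work with this sketch.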

In human evaluation, the model is the second-best approach on the TextDetox 2024 shared task. More precisely, it shows state-of-the-art performance for Ukrainian, top-2 scores for Arabic, and near state-of-the-art performance for the other languages.

Example usage

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained('s-nlp/mt0-xl-detox-orpo', device_map="auto")
tokenizer = AutoTokenizer.from_pretrained('s-nlp/mt0-xl-detox-orpo')

LANG_PROMPTS = {
   'zh': '排毒：',
   'es': 'Desintoxicar: ',
   'ru': 'Детоксифицируй: ',
   'ar': 'إزالة السموم: ',
   'hi': 'विषहरण: ',
   'uk': 'Детоксифікуй: ',
   'de': 'Entgiften: ',
   'am': 'መርዝ መርዝ: ',
   'en': 'Detoxify: ',
}

def detoxify(text, lang, model, tokenizer):
   # Prepend the language-specific prompt and move the inputs to the model's device.
   encodings = tokenizer(LANG_PROMPTS[lang] + text, return_tensors='pt').to(model.device)

   # Diverse beam search: 10 beams split into 5 groups, returning 5 candidate rewrites.
   outputs = model.generate(**encodings,
                            max_length=128,
                            num_beams=10,
                            no_repeat_ngram_size=3,
                            repetition_penalty=1.2,
                            num_beam_groups=5,
                            diversity_penalty=2.5,
                            num_return_sequences=5,
                            early_stopping=True,
                            )

   # Decode all returned candidates into plain strings.
   return tokenizer.batch_decode(outputs, skip_special_tokens=True)
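Because `detoxify` returns five candidates (`num_return_sequences=5`), the decoded strings may contain near-duplicates that differ only in case or whitespace. A minimal post-processing sketch follows; the helper name `unique_candidates` is hypothetical and not part of the model card or of `transformers`.

```python
from typing import List


def unique_candidates(candidates: List[str]) -> List[str]:
    """Drop empty strings and duplicates that differ only in case or whitespace.

    The original order is kept, so the first occurrence of each rewrite survives.
    """
    seen = set()
    unique = []
    for candidate in candidates:
        # Normalize whitespace and case for comparison only.
        key = " ".join(candidate.split()).casefold()
        if key and key not in seen:
            seen.add(key)
            unique.append(candidate.strip())
    return unique
```

It can be applied directly to the list returned by `detoxify`, for example `unique_candidates(detoxify(text, 'en', model, tokenizer))`.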
