---
license: afl-3.0
language:
- vi
pipeline_tag: token-classification
tags:
- vietnamese
- accents inserter
---

# A Transformer model for inserting Vietnamese accent marks

This model is fine-tuned from XLM-RoBERTa Large.

Example input: Toi di hoc.

Target output: Tôi đi học.

## Model training

This problem was modelled as a token classification problem. For each input token, the goal is to assign a "tag" that transforms it into the accented token.

For more details on the training process, please refer to this [blog post](https://peterhung.org/tech/insert-vietnamese-accent-transformer-model/).

## How to use this model

There are 4 main steps:
- Load the model as a token classification model (*AutoModelForTokenClassification*).
- Run the input through the model to obtain the tag index for each input token.
- Use the tag indices to retrieve the actual tags from the file *selected_tags_names.txt*.
- Apply the transformation to each token to obtain accented tokens.
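
The steps above can be sketched roughly as follows. This is a minimal illustration, not the author's reference code: the function names, the `model_path` parameter (the Hub id or local directory of this model, which this section does not state), and above all the tag format are assumptions. The sketch assumes *selected_tags_names.txt* holds one tag per line, that line *i* corresponds to class index *i*, and that a tag looks like `raw-accented` (for example `o-ô`), with a bare `-` meaning "leave the token unchanged". Please check the blog post for the real format before relying on it.

```python
from typing import List, Sequence, Tuple


def load_tags(path: str) -> List[str]:
    """Read the tag list; line i is assumed to be the tag for class index i."""
    with open(path, encoding="utf-8") as f:
        return f.read().splitlines()


def predict_tag_indices(text: str, model_path: str) -> Tuple[List[str], List[int]]:
    """Steps 1 and 2: load the model and get one tag index per model token."""
    # Imported lazily so the pure-Python helpers below work without these packages.
    import torch
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForTokenClassification.from_pretrained(model_path)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, seq_len, num_tags)

    indices = logits.argmax(dim=-1)[0].tolist()
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return tokens, indices


def apply_tag(token: str, tag: str) -> str:
    """Step 4 for one token, under the ASSUMED 'raw-accented' tag format."""
    raw, sep, accented = tag.partition("-")
    if not sep or not raw:
        # A bare "-" (or a tag without a separator) leaves the token unchanged.
        return token
    # Replace the first occurrence of the unaccented part with the accented one.
    return token.replace(raw, accented, 1)


def accent_tokens(
    tokens: Sequence[str], tag_indices: Sequence[int], tags: Sequence[str]
) -> List[str]:
    """Steps 3 and 4: look up each tag by index and apply it to its token."""
    return [apply_tag(tok, tags[i]) for tok, i in zip(tokens, tag_indices)]
```

With a hypothetical tag list `["-", "oi-ôi", "d-đ", "o-ọ"]`, calling `accent_tokens(["Toi", "di", "hoc", "."], [1, 2, 3, 0], tags)` maps the example input to the accented tokens. Note that `predict_tag_indices` returns the tokenizer's own tokens, which include special tokens and subword pieces; a real pipeline would need to drop or merge those before applying tags.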