peterhung committed on
Commit
cfe60a2
1 Parent(s): e711ddf

Update README.md

Files changed (1)
  1. README.md +31 -3
README.md CHANGED
@@ -1,3 +1,31 @@
- ---
- license: afl-3.0
- ---
+ ---
+ license: afl-3.0
+ language:
+ - vi
+ pipeline_tag: token-classification
+ tags:
+ - vietnamese
+ - accents inserter
+ ---
+
+ # A Transformer model for inserting Vietnamese accent marks
+
+ This model is fine-tuned from XLM-RoBERTa Large.
+
+ Example input: Toi di hoc.
+ Target output: Tôi đi học.
+
+ ## Model training
+ This problem was modelled as a token classification problem. For each input token, the goal is to assign a "tag" that transforms it
+ into the accented token.
+ For more details on the training process, please refer to this [blog post](https://peterhung.org/tech/insert-vietnamese-accent-transformer-model/).
+
+ ## How to use this model
+ There are 4 main steps:
+ - Load the model as a token classification model (*AutoModelForTokenClassification*).
+ - Run the input through the model to obtain the tag index for each input token.
+ - Use the tag indices to retrieve the actual tags from the file *selected_tags_names.txt*.
+ - Apply the transformation to each token to obtain the accented tokens.
+
+
+
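The usage steps listed in the README can be sketched roughly as follows. This is a minimal illustration, not the author's reference code. It assumes that *selected_tags_names.txt* holds one tag per line, in label-index order, and that a tag has the hypothetical form `raw-accented` (for example `o-ô`), with any other tag meaning "leave the token unchanged". The real tag format is described in the blog post and may differ. The helper names `load_tags`, `apply_tag`, `predict_tag_ids`, and `main` are made up for this sketch, and the model identifier is left as a parameter because the diff does not state it.

```python
from pathlib import Path


def load_tags(path):
    """Read the tag list, assuming one tag per line in label-index order."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def apply_tag(token, tag):
    """Apply one tag to one token.

    ASSUMPTION: a tag looks like "raw-accented" (e.g. "o-ô") and means
    "replace the first occurrence of raw with accented". Tags that do not
    fit this shape, or do not match the token, leave the token unchanged.
    """
    if "-" not in tag:
        return token
    raw, accented = tag.split("-", 1)
    if not raw or raw not in token:
        return token
    return token.replace(raw, accented, 1)


def predict_tag_ids(text, model_id):
    """Steps 1 and 2: load the model and get one tag index per subword token."""
    # Imported here so the pure helpers above work without these packages.
    import torch
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForTokenClassification.from_pretrained(model_id)
    model.eval()

    encoded = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**encoded).logits  # shape: (1, seq_len, num_tags)
    tag_ids = logits.argmax(dim=-1)[0].tolist()
    tokens = tokenizer.convert_ids_to_tokens(encoded["input_ids"][0].tolist())
    return tokens, tag_ids


def main(model_id, tags_path="selected_tags_names.txt"):
    """Steps 3 and 4: map tag indices to tags and apply them."""
    tags = load_tags(tags_path)
    tokens, tag_ids = predict_tag_ids("Toi di hoc.", model_id)
    # Tokens are raw subword pieces (including special tokens); merging them
    # back into whole words is omitted from this sketch.
    for token, tag_id in zip(tokens, tag_ids):
        print(token, "->", apply_tag(token, tags[tag_id]))
```

To try it, call `main(...)` with the Hub identifier or local path of this model, with *selected_tags_names.txt* downloaded next to the script. The model is loaded inside `predict_tag_ids` so that the tag-handling helpers can be checked on their own.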