AkitoP committed on
Commit
9623385
1 Parent(s): 02c78d0

Update README.md

  1. README.md +15 -1
README.md CHANGED
@@ -9,4 +9,18 @@ metrics:
  base_model:
  - openai/whisper-large-v3-turbo
  library_name: transformers
- ---
+ ---
+ ---
+
+ # Whisper Large V3 Japanese Phone Accent
+
+ This Whisper model transcribes Japanese speech into Katakana with pitch accent annotations. It is built on whisper-large-v3-turbo and was fine-tuned on a 1/20 subset of the Galgame-Speech dataset and on the jsut-5000 dataset.
+
+ ## Training Data:
+ - **Stage 1**: Audio from the Galgame-Speech dataset. The transcripts were converted into Katakana sequences with pitch accent annotations using pyopenjtalk.
+ - **Stage 2**: The JSUT-5000 dataset, using its original training set, which includes pitch accent annotations. The data was split into 90% for training and 10% for evaluation.
+
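The 90/10 split described for Stage 2 could be reproduced along the following lines. This is only a sketch: the README does not say whether the split was random or which seed was used, so the shuffling, the `seed` value, the helper name `split_train_eval`, and the integer utterance IDs are all assumptions made for illustration.

```python
import random


def split_train_eval(items, eval_fraction=0.1, seed=0):
    """Shuffle items deterministically and hold out a fraction for evaluation.

    The seed and the random shuffling are assumptions; the README only
    states the 90% / 10% proportions.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_eval = round(len(shuffled) * eval_fraction)
    # The first n_eval shuffled items form the evaluation set, the rest train.
    return shuffled[n_eval:], shuffled[:n_eval]


# Hypothetical utterance IDs; the count of 5000 is only suggested by the dataset name.
train_ids, eval_ids = split_train_eval(range(5000))
print(len(train_ids), len(eval_ids))
```

Fixing the seed keeps the evaluation set identical across runs, so CER figures from different training stages are measured on the same utterances.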
+ ## Evaluation Results:
+ - The model achieved a CER (Character Error Rate) of approximately 4% on the JSUT-5000 test set, compared with 7% for pyopenjtalk.
+ - Training with Stage 1 alone resulted in a CER of 13%; errors included specific misreadings and confusion between on'yomi (音読み) and kun'yomi (訓読み) readings. Stage 2 reduced these errors.
+
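For readers unfamiliar with the metric, CER is commonly computed as the character-level edit distance between reference and hypothesis, divided by the number of reference characters. The sketch below shows that common definition; the README does not state how its figures were calculated (for example, corpus-level versus per-utterance averaging, or how accent marks are counted), so the function names and the corpus-level aggregation here are assumptions.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, one row at a time."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Corpus-level CER: total edits divided by total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    return total_edits / total_chars


# Hypothetical strings: one of four reference characters is missing.
print(cer(["カタカナ"], ["カタナ"]))  # 0.25
```

Under this definition every symbol in the output string, including any accent marker, counts as one character, so an accent error and a reading error weigh the same.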
+ We are currently seeking Japanese datasets annotated with pitch accent. If you have such data, please reach out!