mms
vineelpratap committed on
Commit
7116a2a
1 Parent(s): b2c866d

Update README.md

Files changed (1)
  1. README.md +12 -3
README.md CHANGED
@@ -8,6 +8,17 @@ tags:
 
 This repository consists of the n-gram language models trained on Common Crawl data ([Conneau et al. 2020b](https://aclanthology.org/2020.acl-main.747/), [NLLB_Team et al. 2022](https://arxiv.org/abs/2207.04672)) using [KenLM library](https://github.com/kpu/kenlm).
 
+
+For the following languages, the LMs are not present in the repository (due to the 50GB limit on HuggingFace) and can be downloaded using the links provided here.
+
+Mandarin Chinese (Simplified) - [Download LM](https://dl.fbaipublicfiles.com/mms/lms/cmn-script_simplified/char_20gram.bin)
+
+Japanese - [Download LM](https://dl.fbaipublicfiles.com/mms/lms/jpn/char_20gram.bin)
+
+Thai - [Download LM](https://dl.fbaipublicfiles.com/mms/lms/tha/char_20gram.bin)
+
+Cantonese (Traditional) - [Download LM](https://dl.fbaipublicfiles.com/mms/lms/yue-script_traditional/char_20gram.bin)
+
 ## Table Of Content
 
 - [Example](#example)
@@ -17,10 +28,8 @@ This repository consists of the n-gram language models trained on Common Crawl d
 
 ## Example
 
-```py
+Check out the code at https://huggingface.co/spaces/mms-meta/MMS/blob/main/asr.py, which uses LMs for decoding the output from ASR models.
 
-TODO
-```
 
 ## Supported Languages
 
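
The four download links added in this commit share one URL pattern, so fetching them can be scripted. The following is a minimal sketch using only the Python standard library, not code from the repository. The names `LM_DIRS`, `lm_url`, and `download_lm`, the short language labels, and the `lms` destination directory are all hypothetical. The base URL is assumed to use the standard `https://` scheme. The files may be large, given the 50GB limit mentioned in the README.

```python
import shutil
import urllib.request
from pathlib import Path

# Base URL as it appears in the download links of the README diff
# (assumption: the intended scheme is plain "https://").
BASE_URL = "https://dl.fbaipublicfiles.com/mms/lms"

# Directory names copied from the four links; the dictionary keys are
# hypothetical labels chosen only for this example.
LM_DIRS = {
    "mandarin": "cmn-script_simplified",
    "japanese": "jpn",
    "thai": "tha",
    "cantonese": "yue-script_traditional",
}


def lm_url(label: str) -> str:
    """Return the download URL of the character 20-gram LM for a label."""
    return f"{BASE_URL}/{LM_DIRS[label]}/char_20gram.bin"


def download_lm(label: str, dest_dir: str = "lms") -> Path:
    """Stream one LM to dest_dir/<dir>/char_20gram.bin and return its path.

    The data is first written to a temporary ".part" file and renamed on
    success, so an interrupted download is not mistaken for a finished one.
    """
    target = Path(dest_dir) / LM_DIRS[label] / "char_20gram.bin"
    if target.exists():
        return target  # already downloaded
    target.parent.mkdir(parents=True, exist_ok=True)
    partial = target.with_suffix(".part")
    with urllib.request.urlopen(lm_url(label)) as response, open(partial, "wb") as out:
        shutil.copyfileobj(response, out)
    partial.replace(target)
    return target


# List the URLs without downloading anything.
for label in LM_DIRS:
    print(label, lm_url(label))
```

Calling `download_lm("thai")` would fetch the Thai model into `lms/tha/char_20gram.bin`. The resulting `.bin` file is a KenLM binary, so it should be loadable by KenLM-based decoders such as the one used in the linked `asr.py`.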