jnishi committed
Commit 1f5ceb2
1 Parent(s): d6454e0

Update README.md

Files changed (1): README.md +3 -3
README.md CHANGED
@@ -6,7 +6,7 @@ language:
 - en
 ---
 
-# Retrieva BERT Model
+# RetrievaBERT Model
 The **RetrievaBERT** is the pre-trained Transformer Encoder using Megatron-LM.
 It is designed for use in Japanese.
 
@@ -70,7 +70,7 @@ For detailed configuration, refer to the config.json file.
 ## Training Details
 
 ### Training Data
-The Retrieva BERT model was pre-trained on the reunion of five datasets:
+The RetrievaBERT model was pre-trained on the reunion of five datasets:
 - [Japanese CommonCrawl Dataset by LLM-jp](https://gitlab.llm-jp.nii.ac.jp/datasets/llm-jp-corpus-v2).
 - [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb).
 - Chinese Wikipedia dumped on 20240120.
@@ -112,7 +112,7 @@ We adjusted the learning rate and training epochs for each model and task in acc
 ## Technical Specifications
 
 ### Model Architectures
-The Retrieva BERT model is based on BERT with the following hyperparameters:
+The RetrievaBERT model is based on BERT with the following hyperparameters:
 
 - Number of layers: 48
 - Hidden layer size: 1536
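The last hunk above ends with the architecture hyperparameters documented in the README. As a rough illustration of that shape, here is a minimal sketch expressed as a Hugging Face `BertConfig`; only the layer count and hidden size come from the diff, every other value is an assumed placeholder, and the actual model ships its own custom Megatron-LM-based configuration rather than plain `BertConfig`.

```python
# Minimal sketch of the architecture described in the hunk above.
# Only num_hidden_layers and hidden_size are taken from the diff;
# the head count and intermediate size are assumptions, and the real
# RetrievaBERT uses its own custom model class, not vanilla BERT.
from transformers import BertConfig

config = BertConfig(
    num_hidden_layers=48,    # "Number of layers: 48" (from the diff)
    hidden_size=1536,        # "Hidden layer size: 1536" (from the diff)
    num_attention_heads=24,  # assumption: hidden_size must divide evenly (1536 / 24 = 64)
    intermediate_size=6144,  # assumption: the conventional 4x hidden size
)
print(config)
```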
 
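Because the README this commit edits is a Hugging Face model card for a masked language model, a loading sketch may help readers of the diff. This is a hedged example only: the repository id `retrieva-jp/bert-1.3b` and the `trust_remote_code=True` requirement are assumptions not stated in the hunks above; substitute the actual repository id.

```python
# Hedged masked-LM usage sketch for a RetrievaBERT-style checkpoint.
# The model id below is an assumption; the custom Megatron-LM-based
# architecture is assumed to require trust_remote_code=True.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_id = "retrieva-jp/bert-1.3b"  # assumed id; replace if different
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForMaskedLM.from_pretrained(model_id, trust_remote_code=True)

# Japanese fill-mask probe: "Hello, I am [MASK]."
text = f"こんにちは、{tokenizer.mask_token}です。"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Decode the top prediction at the masked position.
mask_index = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_id = logits[0, mask_index].argmax(dim=-1)
print(tokenizer.decode(top_id))
```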