Mxode committed on
Commit
6dcd97c
1 Parent(s): d1f3901

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -37,7 +37,7 @@ This is NanoLM-0.3B-Instruct-v1.1. The model currently supports both **Chinese a
  | :----------: | :------------------: | :---: | :----: | :-------: | :---: | :---: |
  | 25M | 15M | MistralForCausalLM | 12 | 312 | 12 |2K|
  | 70M | 42M | LlamaForCausalLM | 12 | 576 | 9 |2K|
- | 0.3B | 180M | Qwen2ForCausalLM | 12 | 896 | 14 |4K|
+ | **0.3B** | **180M** | **Qwen2ForCausalLM** | **12** | **896** | **14** | **4K** |
  | 1B | 840M | Qwen2ForCausalLM | 18 | 1536 | 12 |4K|
 
  The tokenizer and model architecture of NanoLM-0.3B-Instruct-v1.1 are the same as [Qwen/Qwen2-0.5B](https://huggingface.co/Qwen/Qwen2-0.5B), but the number of layers has been reduced from 24 to 12.
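
As a rough sanity check on the 0.3B row, the parameter counts can be estimated from the table's layer count, hidden size, and head count. This is a minimal sketch, not the author's method. It assumes values that do not appear in the diff and are taken from the published Qwen2-0.5B configuration: an MLP intermediate size of 4864, 2 key/value heads, a vocabulary of 151,936 tokens, tied input/output embeddings, and biases on the q/k/v projections only. Reading the table's second column as non-embedding parameters is also an assumption, because the header row falls outside the hunk. The function name `qwen2_style_param_counts` is hypothetical.

```python
def qwen2_style_param_counts(layers, hidden, heads, kv_heads, intermediate, vocab):
    """Estimate (non_embedding, total) parameters for a Qwen2-style decoder.

    Assumes grouped-query attention, biases on q/k/v only, a gated MLP with
    three projection matrices, RMSNorm weights, and tied embeddings.
    """
    head_dim = hidden // heads
    kv_dim = kv_heads * head_dim

    attn = hidden * hidden + hidden          # q_proj weight + bias
    attn += 2 * (hidden * kv_dim + kv_dim)   # k_proj and v_proj, each with bias
    attn += hidden * hidden                  # o_proj, no bias

    mlp = 3 * hidden * intermediate          # gate, up, and down projections
    norms = 2 * hidden                       # two RMSNorm weight vectors per layer

    non_embedding = layers * (attn + mlp + norms) + hidden  # + final norm
    embedding = vocab * hidden               # counted once (tied embeddings)
    return non_embedding, non_embedding + embedding


# Values from the 0.3B row (12 layers, dim 896, 14 heads); the remaining
# arguments are the assumed Qwen2-0.5B settings described above.
non_emb, total = qwen2_style_param_counts(
    layers=12, hidden=896, heads=14, kv_heads=2, intermediate=4864, vocab=151_936
)
print(f"non-embedding: {non_emb / 1e6:.1f}M, total: {total / 1e6:.1f}M")
```

Under these assumptions the estimate comes out near 179M non-embedding parameters and about 315M in total, which is consistent with the 180M and 0.3B figures in the row.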