Mxode/NanoLM-70M-Instruct-v1

Text2Text Generation

Mxode committed 24 days ago

Commit 2ee8f54 (1 parent: cce4818)

Update README.md

Files changed (1)

README.md +2 -0

@@ -21,6 +21,8 @@ This is NanoLM-70M-Instruct-v1. The model currently supports **English only**.
 The tokenizer and model architecture of NanoLM-70M-Instruct-v1 are the same as [SmolLM-135M](https://huggingface.co/HuggingFaceTB/SmolLM-135M), but the number of layers has been reduced from 30 to 12.
+Essentially, it is a pure LLaMA architecture, specifically LlamaForCausalLM.
 As a result, NanoLM-70M-Instruct-v1 has only 70 million parameters.
 Despite this, NanoLM-70M-Instruct-v1 still demonstrates instruction-following capabilities.
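
The claim that cutting the layer count from 30 to 12 yields roughly 70 million parameters can be sanity-checked with a rough parameter count for a LLaMA-style decoder. This is only a sketch: the dimensions below (hidden size 576, MLP size 1536, 9 attention heads, 3 key/value heads, vocabulary of 49152, tied input/output embeddings, no bias terms) are assumptions taken from the published SmolLM-135M configuration, not values stated in this README, and `llama_param_count` is a hypothetical helper name.

```python
def llama_param_count(
    num_layers: int,
    hidden_size: int = 576,        # assumed, from SmolLM-135M config
    intermediate_size: int = 1536,  # assumed
    num_heads: int = 9,             # assumed
    num_kv_heads: int = 3,          # assumed (grouped-query attention)
    vocab_size: int = 49152,        # assumed
    tie_embeddings: bool = True,    # assumed
) -> int:
    """Approximate parameter count of a bias-free LLaMA-style decoder."""
    head_dim = hidden_size // num_heads
    kv_dim = num_kv_heads * head_dim

    # Attention: q_proj and o_proj are hidden x hidden, k_proj and v_proj are hidden x kv_dim.
    attention = 2 * hidden_size * hidden_size + 2 * hidden_size * kv_dim
    # Gated MLP: gate_proj, up_proj, down_proj.
    mlp = 3 * hidden_size * intermediate_size
    # Two RMSNorm weight vectors per layer.
    norms = 2 * hidden_size
    per_layer = attention + mlp + norms

    embeddings = vocab_size * hidden_size
    lm_head = 0 if tie_embeddings else vocab_size * hidden_size
    final_norm = hidden_size

    return num_layers * per_layer + embeddings + lm_head + final_norm


if __name__ == "__main__":
    for layers in (30, 12):
        print(f"{layers} layers: {llama_param_count(layers) / 1e6:.1f}M parameters")
```

Under these assumptions the 30-layer count comes out near 134.5 million and the 12-layer count near 70.8 million, which is consistent with the model names. The embedding table is unaffected by the layer reduction, which is why the total does not shrink in proportion to the layer count.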