Mxode committed on
Commit 779cbd7
1 Parent(s): 7eef1ae

Update README.md

  1. README.md +62 -3
README.md CHANGED
@@ -1,3 +1,62 @@
- ---
- license: gpl-3.0
- ---
+ ---
+ license: gpl-3.0
+ ---
+ # NanoLM-70M-Instruct-v1
+
+ English | [简体中文](README_zh-CN.md)
+
+ ## Introduction
+
+ To explore the potential of small models, I have built a series of them, available in the [NanoLM Collections](https://huggingface.co/collections/Mxode/nanolm-66d6d75b4a69536bca2705b2).
+
+ This is NanoLM-70M-Instruct-v1. The model currently supports **English only**.
+
+ ## Model Details
+
+ NanoLM-70M-Instruct-v1 uses the same tokenizer and model architecture as [SmolLM-135M](https://huggingface.co/HuggingFaceTB/SmolLM-135M), but the number of layers has been reduced from 30 to 12.
+
+ As a result, NanoLM-70M-Instruct-v1 has only 70 million parameters.
+
+ Despite its small size, NanoLM-70M-Instruct-v1 still demonstrates instruction-following capabilities.
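
The drop from 30 to 12 layers can be checked with a rough parameter count. The sketch below assumes a Llama-style decoder with no bias terms, tied input/output embeddings, and grouped-query attention. The hyperparameter values are my recollection of the SmolLM-135M config and are not stated in this README, so treat them as assumptions.

```python
# Assumed SmolLM-135M hyperparameters (not given in this README).
VOCAB_SIZE = 49152
HIDDEN = 576
INTERMEDIATE = 1536
NUM_HEADS = 9
NUM_KV_HEADS = 3


def llama_param_count(num_layers: int) -> int:
    """Estimate parameters of a Llama-style model with tied embeddings and no biases."""
    head_dim = HIDDEN // NUM_HEADS
    kv_dim = head_dim * NUM_KV_HEADS
    embed = VOCAB_SIZE * HIDDEN                       # shared with the output head
    attn = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * kv_dim  # q/o plus k/v projections
    mlp = 3 * HIDDEN * INTERMEDIATE                   # gate, up, and down projections
    norms = 2 * HIDDEN                                # two RMSNorm weights per layer
    return embed + num_layers * (attn + mlp + norms) + HIDDEN  # plus final norm


print(llama_param_count(30))  # 134515008
print(llama_param_count(12))  # 70793280
```

Under these assumptions the 12-layer variant comes to roughly 70.8 million parameters, consistent with the "70M" in the model name. The embedding table accounts for about 28 million of them, which is why cutting 60% of the layers removes less than half of the parameters.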
+
+ ## How to use
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_path = 'Mxode/NanoLM-70M-Instruct-v1'
+
+ # Load the model onto the first GPU in bfloat16.
+ model = AutoModelForCausalLM.from_pretrained(model_path).to('cuda:0', torch.bfloat16)
+ tokenizer = AutoTokenizer.from_pretrained(model_path)
+
+ text = "Why is it important for entrepreneurs to prioritize financial management?"
+ # Render the messages with the model's chat template and tokenize them.
+ prompt = tokenizer.apply_chat_template(
+     [
+         {'role': 'system', 'content': 'You are a helpful assistant.'},
+         {'role': 'user', 'content': text}
+     ],
+     add_generation_prompt=True,
+     tokenize=True,
+     return_tensors='pt'
+ ).to('cuda:0')
+
+ # Sample a response; the output holds the prompt tokens followed by the new tokens.
+ outputs = model.generate(
+     prompt,
+     max_new_tokens=1024,
+     do_sample=True,
+     temperature=0.7,
+     repetition_penalty=1.1,
+     eos_token_id=tokenizer.eos_token_id,
+ )
+ response = tokenizer.decode(outputs[0])
+ print(response)
+ ```
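
For a decoder-only model, `generate` returns the prompt tokens followed by the newly generated ones, so decoding `outputs[0]` prints the system and user messages as well as the reply. A minimal sketch of trimming the prompt is shown here with plain lists as stand-ins for token IDs. The helper name `strip_prompt` and the ID values are hypothetical and not part of the README.

```python
def strip_prompt(output_ids, prompt_len):
    """Return only the tokens that were generated after the prompt."""
    return output_ids[prompt_len:]


# Toy stand-ins for token IDs (made-up values, not real vocabulary entries).
prompt_ids = [1, 42, 7, 2]
output_ids = prompt_ids + [99, 100, 3]

new_ids = strip_prompt(output_ids, len(prompt_ids))
print(new_ids)  # [99, 100, 3]
```

With the variables from the snippet above, the same idea would read `tokenizer.decode(outputs[0][prompt.shape[-1]:], skip_special_tokens=True)`, which should print only the assistant's reply without special tokens.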