norabelrose committed on
Commit
4dd39be
1 Parent(s): 6e5549a

Upload folder using huggingface_hub

Files changed (3)
  1. README.md +12 -0
  2. config.json +10 -0
  3. pytorch_model.bin +3 -0
README.md ADDED
@@ -0,0 +1,12 @@
+ ---
+ license: apache-2.0
+ ---
+ Mamba-2.8b-slimpj is a model using the [Mamba](https://arxiv.org/abs/2312.00752) architecture, with 2.8B parameters, trained for 600B tokens on the SlimPajama dataset.
+
+ Model code: https://github.com/state-spaces/mamba/tree/main
+
+ To load the model, follow the installation instruction in the code repo, and then:
+ ```
+ from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel
+ model = MambaLMHeadModel.from_pretrained("state-spaces/mamba-2.8b-slimpj")
+ ```
config.json ADDED
@@ -0,0 +1,10 @@
+ {
+ "d_model": 2560,
+ "n_layer": 64,
+ "vocab_size": 50277,
+ "ssm_cfg": {},
+ "rms_norm": true,
+ "residual_in_fp32": true,
+ "fused_add_norm": true,
+ "pad_vocab_size_multiple": 8
+ }
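
Note on `vocab_size` and `pad_vocab_size_multiple`: the two fields in config.json presumably interact by rounding the vocabulary up to the next multiple of 8 when the embedding table is sized. The sketch below only illustrates that rounding arithmetic on the config values from this commit. The helper name `padded_vocab_size` is hypothetical and is not taken from the model code.

```python
import json

# Config values copied from the config.json added in this commit.
CONFIG_JSON = """
{
    "d_model": 2560,
    "n_layer": 64,
    "vocab_size": 50277,
    "ssm_cfg": {},
    "rms_norm": true,
    "residual_in_fp32": true,
    "fused_add_norm": true,
    "pad_vocab_size_multiple": 8
}
"""


def padded_vocab_size(vocab_size: int, multiple: int) -> int:
    """Round vocab_size up to the nearest multiple (hypothetical helper)."""
    remainder = vocab_size % multiple
    if remainder == 0:
        return vocab_size
    return vocab_size + (multiple - remainder)


config = json.loads(CONFIG_JSON)
padded = padded_vocab_size(config["vocab_size"], config["pad_vocab_size_multiple"])
print(padded)  # 50280
```

Under this assumption the embedding table would have 50280 rows, three more than the 50277 entries listed in `vocab_size`.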
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:21263cfbd04dae22e68d36be9e3ca9d4a4784099bc09fc53338104a5f22131ef
+ size 1332860928
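
Note on the pointer file: the three lines added for pytorch_model.bin are a Git LFS pointer, not the weights themselves. A minimal sketch of reading such a pointer and checking a downloaded file against it might look like the following. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the sketch assumes the simple `key value` line layout shown in this diff.

```python
import hashlib
import os

# Pointer contents copied from the diff in this commit.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:21263cfbd04dae22e68d36be9e3ca9d4a4784099bc09fc53338104a5f22131ef
size 1332860928
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a pointer file into a dict (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict) -> bool:
    """Check a local file's size and SHA-256 digest against the pointer (hypothetical helper)."""
    if os.path.getsize(path) != pointer["size_bytes"]:
        return False
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large weight files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["hash_algorithm"], pointer["size_bytes"])
```

After downloading the weights, `matches_pointer("pytorch_model.bin", pointer)` would report whether the local file agrees with the size and hash recorded here.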