Crystalcareai committed
Commit 45237dc
1 Parent(s): 0f236b2

Update README.md

Files changed (1):
  1. README.md +3 -2
README.md CHANGED
@@ -1,9 +1,10 @@
 This is an MOE of Llama-3-8b with 4 experts. This does not use semantic routing, as this utilizes the deepseek-moe architecture. There is no routing, and there is no gate - all experts are active on every token.
 
-```import torch
+```python
+import torch
 from transformers import AutoTokenizer, TextStreamer, AutoModelForCausalLM
 
-model_path = "./content"
+model_path = "Crystalcareai/llama-3-4x8b"
 model = AutoModelForCausalLM.from_pretrained(
     model_path,
     device_map="auto",
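Note: the hunk above ends at `device_map="auto",`, so the rest of the README's snippet is not shown in this diff. Below is a minimal sketch of how such a load-and-generate script could continue, using only the imports and model id visible in the diff; the `torch_dtype`, `trust_remote_code`, prompt, and generation settings are assumptions, not the README's actual code:

```python
import torch
from transformers import AutoTokenizer, TextStreamer, AutoModelForCausalLM

model_path = "Crystalcareai/llama-3-4x8b"

# Load the 4-expert MoE checkpoint. trust_remote_code=True is an assumption:
# deepseek-moe-style repos often ship custom modeling code on the Hub.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype=torch.bfloat16,  # assumed dtype, not shown in the diff
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# TextStreamer prints tokens to stdout as they are generated.
streamer = TextStreamer(tokenizer, skip_prompt=True)

prompt = "What is a mixture-of-experts language model?"  # hypothetical prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
_ = model.generate(**inputs, max_new_tokens=128, streamer=streamer)
```

The README's claim that there is "no routing, and there is no gate - all experts are active on every token" can be illustrated with a toy module. This is a schematic sketch of that idea, not the deepseek-moe implementation:

```python
import torch
import torch.nn as nn

class UngatedMoEMLP(nn.Module):
    """Toy illustration: several expert MLPs, no router or gate.
    Every expert processes every token; outputs are averaged."""

    def __init__(self, hidden: int, intermediate: int, n_experts: int = 4):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(hidden, intermediate),
                nn.SiLU(),
                nn.Linear(intermediate, hidden),
            )
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # No top-k selection or learned gate: all experts run on every
        # token and their outputs are combined with equal weight.
        return torch.stack([expert(x) for expert in self.experts]).mean(dim=0)
```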