MetaModel_moe

This model is a Mixture of Experts (MoE) built with mergekit (mixtral branch). Its base model and expert models are listed in the configuration below.

🧩 Configuration

base_model: gagan3012/MetaModel
gate_mode: hidden
dtype: bfloat16
experts:
- source_model: gagan3012/MetaModel
- source_model: jeonsworld/CarbonVillain-en-10.7B-v2
- source_model: jeonsworld/CarbonVillain-en-10.7B-v4
- source_model: TomGrc/FusionNet_linear
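
To give a feel for what the four experts in this configuration do at inference time, here is a minimal sketch of per-token routing. It assumes Mixtral-style top-2 routing (a softmax over the router scores of the two best experts); the card does not state the routing settings, and the gate logits below are made-up numbers for illustration. In the real model, routing happens inside every MoE layer on hidden states, not on whole models.

```python
import math

# Expert names copied from the configuration above.
EXPERTS = [
    "gagan3012/MetaModel",
    "jeonsworld/CarbonVillain-en-10.7B-v2",
    "jeonsworld/CarbonVillain-en-10.7B-v4",
    "TomGrc/FusionNet_linear",
]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_logits, k=2):
    """Pick the k highest-scoring experts and normalise their weights.

    Assumption: top-k selection followed by a softmax over the selected
    logits, as in Mixtral-style routers.
    """
    ranked = sorted(range(len(gate_logits)), key=lambda i: gate_logits[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([gate_logits[i] for i in chosen])
    return list(zip(chosen, weights))


# Hypothetical router scores for a single token (illustrative only).
example_logits = [0.3, 2.1, 1.9, -0.5]
routing = route_top_k(example_logits)

for expert_index, weight in routing:
    print(f"{EXPERTS[expert_index]}: weight {weight:.2f}")
```

The token's output is then the weighted sum of the selected experts' outputs, so only two of the four expert blocks run for any given token.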

💻 Usage

# Notebook cell syntax; in a shell, drop the leading "!".
!pip install -qU transformers bitsandbytes accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "gagan3012/MetaModel_moe"

# Load the tokenizer and build a text-generation pipeline.
# load_in_4bit quantizes the weights with bitsandbytes to reduce memory use.
tokenizer = AutoTokenizer.from_pretrained(model)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
)

# Format the conversation with the model's chat template, then sample a reply.
messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
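
The usage snippet loads the model in 4-bit because of its size. As a rough sketch of why, the following estimates the memory needed for the weights alone at a few precisions, using the 36.1B parameter count reported on the model page. The bytes-per-parameter figures are nominal assumptions; real usage will be higher once quantization metadata, activations, and the KV cache are included.

```python
# Parameter count as reported on the model page.
NUM_PARAMS = 36.1e9

# Nominal storage cost per parameter (assumption: ignores quantization
# metadata and any layers kept in higher precision).
BYTES_PER_PARAM = {
    "bfloat16": 2.0,
    "int8": 1.0,
    "4-bit": 0.5,
}


def weight_memory_gib(num_params, bytes_per_param):
    """Approximate memory for the weights only, in GiB."""
    return num_params * bytes_per_param / 2**30


estimates = {name: weight_memory_gib(NUM_PARAMS, size) for name, size in BYTES_PER_PARAM.items()}

for name, gib in estimates.items():
    print(f"{name}: ~{gib:.0f} GiB for weights")
```

By this estimate the unquantized bfloat16 weights need well over 60 GiB, while 4-bit weights fit in under 20 GiB.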

Open LLM Leaderboard Evaluation Results

Detailed results are published on the Open LLM Leaderboard.

| Metric | Value |
|---|---|
| Avg. | 74.42 |
| ARC (25-shot) | 71.25 |
| HellaSwag (10-shot) | 88.4 |
| MMLU (5-shot) | 66.26 |
| TruthfulQA (0-shot) | 71.86 |
| Winogrande (5-shot) | 83.35 |
| GSM8K (5-shot) | 65.43 |
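
As a quick sanity check on the table, the reported average can be recomputed as the unweighted mean of the six benchmark scores. The short sketch below assumes that "Avg." is a simple arithmetic mean; the six scores sum to 446.55, giving a mean of 74.425, which agrees with the reported 74.42 to within rounding.

```python
# Benchmark scores copied from the results table.
scores = {
    "ARC (25-shot)": 71.25,
    "HellaSwag (10-shot)": 88.4,
    "MMLU (5-shot)": 66.26,
    "TruthfulQA (0-shot)": 71.86,
    "Winogrande (5-shot)": 83.35,
    "GSM8K (5-shot)": 65.43,
}
REPORTED_AVG = 74.42

# Assumption: the leaderboard average is the unweighted mean of the six tasks.
mean_score = sum(scores.values()) / len(scores)
difference = abs(mean_score - REPORTED_AVG)

print(f"recomputed mean: {mean_score:.3f}")
print(f"difference from reported average: {difference:.3f}")
```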

Model size: 36.1B params (Safetensors). Tensor type: BF16.