The license is cc-by-nc-sa-4.0.

🐻‍❄️SOLARC-MOE-10.7Bx6🐻‍❄️

Model Details

Model Developers: Seungyoo Lee (DopeorNope)

I am in charge of Large Language Models (LLMs) on the Markr AI team in South Korea.

Input: The model accepts text only.

Output: The model generates text only.

Model Architecture
SOLARC-MOE-10.7Bx6 is an auto-regressive language model based on the SOLAR architecture.

Base Models

kyujinpy/Sakura-SOLAR-Instruct

Weyaxi/SauerkrautLM-UNA-SOLAR-Instruct

VAGOsolutions/SauerkrautLM-SOLAR-Instruct

fblgit/UNA-SOLAR-10.7B-Instruct-v1.0

jeonsworld/CarbonVillain-en-10.7B-v1

Implemented Method

I built this model with the Mixture of Experts (MoE) approach, using each of the models listed above as a base.

I wanted to test whether it was possible to build an MoE with an expert count that is not a power of 2, such as 6.
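To illustrate why an expert count of 6 is not a problem in principle, here is a minimal sketch of top-k routing. It assumes a Mixtral-style router (softmax over gate scores, keep the top 2, renormalize); the card does not state how routing is configured for this model, so `top_k_route`, `k=2`, and the example scores are all hypothetical.

```python
import math


def top_k_route(gate_logits, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Assumption: Mixtral-style routing. Nothing here depends on the number
    of experts being a power of 2.
    """
    # Numerically stable softmax over all expert scores.
    peak = max(gate_logits)
    exps = [math.exp(x - peak) for x in gate_logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the k most probable experts and renormalize their weights to sum to 1.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


# Six hypothetical router scores, one per expert.
logits = [0.1, 2.0, -1.0, 0.5, 1.5, 0.0]
routes = top_k_route(logits)
```

The selection step is a sort over a list of arbitrary length, so 6 experts are handled the same way as 4 or 8.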

Implementation Code

Load model


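from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

repo = "DopeorNope/SOLARC-MOE-10.7Bx6"

# Load the weights in full precision; device_map='auto' spreads the layers
# across the available devices.
OpenOrca = AutoModelForCausalLM.from_pretrained(
    repo,
    return_dict=True,
    torch_dtype=torch.float32,
    device_map='auto'
)

# Load the tokenizer that matches the model.
OpenOrca_tokenizer = AutoTokenizer.from_pretrained(repo)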
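Once the model and tokenizer are loaded, text generation could look roughly like the sketch below. The helper name `generate_text` and the `max_new_tokens` default are illustrative assumptions, not part of the original card, and the card does not state a prompt template, so the prompt is passed through unchanged.

```python
def generate_text(model, tokenizer, prompt, max_new_tokens=64):
    """Tokenize a prompt, generate a continuation, and decode it to a string.

    Hypothetical helper: `model` and `tokenizer` are expected to be the
    objects returned by the from_pretrained calls above.
    """
    # return_tensors="pt" asks the tokenizer for PyTorch tensors; .to(...)
    # moves them to the device that holds the model's first layers.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # generate() returns token ids that include the prompt tokens.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode the first (and only) sequence in the batch.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call, assuming the variables from the loading code:
# print(generate_text(OpenOrca, OpenOrca_tokenizer, "Hello, how are you?"))
```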


Model size: 53B params (Safetensors)

Tensor type: F32
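As a rough sizing aid, the memory needed for the weights alone is the parameter count times the bytes per parameter. The sketch below uses the rounded 53B figure, so the results are approximate; it ignores activations, the KV cache, and framework overhead, and the helper name `weight_memory_gb` is hypothetical.

```python
# Bytes used to store one parameter in each common dtype.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_memory_gb(num_params, dtype="float32"):
    """Approximate weight memory in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


fp32_gb = weight_memory_gb(53e9, "float32")  # about 212 GB
fp16_gb = weight_memory_gb(53e9, "float16")  # about 106 GB
```

Under these assumptions, loading with a 16-bit dtype instead of `torch.float32` would roughly halve the weight footprint, though the published weights are stored in F32.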
