
Open_Gpt4_v0.2

This is the un-quantized fp16 version for training and merging. If you want the quantized version for inference, please refer to the repo below:


This model is a TIES merge of Mixtral-8x7B-Instruct-v0.1 and bagel-dpo-8x7b-v0.2, with MixtralOrochi8x7B as the base model.

I was very impressed with MixtralOrochi8x7B's performance and multifaceted use cases, as it is already a merge of many useful Mixtral models such as Mixtral instruct, Noromaid-v0.1-mixtral, openbuddy-mixtral, and possibly other models that were not named. My goal was to expand the model's capabilities and make it an even more useful model, maybe even competitive with closed-source models like GPT-4. But for that, more testing is required. I hope the community can help me determine if it's deserving of its name. 😊

This is the second iteration of this model, using better models in the merge to (hopefully) improve performance.

Base model:

- MixtralOrochi8x7B

Merged models:

- Mixtral-8x7B-Instruct-v0.1
- bagel-dpo-8x7b-v0.2

Instruct template: Alpaca
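
For reference, this is the commonly used Alpaca prompt format (the exact wording below is an assumption, since the card only names the template):

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:
```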

Merge config:

models:
  - model: Mixtral-8x7B-Instruct-v0.1
    parameters:
      density: .5
      weight: 1
  - model: bagel-dpo-8x7b-v0.2
    parameters:
      density: .5
      weight: .7

merge_method: ties
base_model: MixtralOrochi8x7B
parameters:
  normalize: true
  int8_mask: true
dtype: float16
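
In mergekit's TIES method, density is the fraction of each model's delta parameters that is kept and weight scales that model's contribution before the sign-consensus step, so the config above keeps half of each delta and weights Mixtral-Instruct more heavily than bagel.

Below is a minimal inference sketch for the merged fp16 weights, assuming the transformers library and the repo id rombodawg/Open_Gpt4_8x7B_v0.2 (adjust if the weights live elsewhere, and adapt dtype/device settings to your hardware):

```python
# Minimal sketch: load the merged fp16 model and run one Alpaca-formatted prompt.
# Assumes enough GPU memory for the 46.7B-parameter fp16 weights; device_map="auto"
# shards across available GPUs if accelerate is installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rombodawg/Open_Gpt4_8x7B_v0.2"  # assumed repo id for this card

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Alpaca-style prompt, matching the instruct template named above.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nSummarize what a TIES merge does in one paragraph.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Print only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```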


Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 73.59 |
| AI2 Reasoning Challenge (25-shot) | 68.69 |
| HellaSwag (10-shot) | 86.16 |
| MMLU (5-shot) | 72.07 |
| TruthfulQA (0-shot) | 71.92 |
| Winogrande (5-shot) | 83.58 |
| GSM8k (5-shot) | 59.14 |