Edit model card

llama-3-neural-chat-v1-8b

image/png

Model Details

Model Description

I fine-tuned llama-3 8B on an approach similar to Intel's neural chat language model. I have slightly modified the data sources so it is stronger in coding, math, and writing. I use both SFT and DPO.

Quants

EXL2 @bartowski

GGUF @bartowski

Uses

This model has great performance in writing and coding.

Training Data

  • Open-Orca/SlimOrca-Dedup
  • jondurbin/airoboros-3.2
  • microsoft/orca-math-word-problems-200k
  • m-a-p/Code-Feedback
  • MaziyarPanahi/WizardLM_evol_instruct_V2_196k
  • mlabonne/orpo-dpo-mix-40k

Direct Use

Conversational AI.

Evaluations

Tasks Version Filter n-shot Metric Value Stderr
truthfulqa_mc2 2 none 0 acc 0.5627 ± 0.0154
gsm8k 3 strict-match 5 exact_match 0.5481 ± 0.0137
flexible-extract 5 exact_match 0.5557 ± 0.0137
agieval_nous N/A none 0 acc 0.3763 ± 0.0093
none 0 acc_norm 0.3665 ± 0.0093
- agieval_aqua_rat 1 none 0 acc 0.2087 ± 0.0255
none 0 acc_norm 0.2047 ± 0.0254
- agieval_logiqa_en 1 none 0 acc 0.3456 ± 0.0187
none 0 acc_norm 0.3594 ± 0.0188
- agieval_lsat_ar 1 none 0 acc 0.1826 ± 0.0255
none 0 acc_norm 0.1783 ± 0.0253
- agieval_lsat_lr 1 none 0 acc 0.3549 ± 0.0212
none 0 acc_norm 0.3451 ± 0.0211
- agieval_lsat_rc 1 none 0 acc 0.5242 ± 0.0305
none 0 acc_norm 0.5130 ± 0.0305
- agieval_sat_en 1 none 0 acc 0.6650 ± 0.0330
none 0 acc_norm 0.6505 ± 0.0333
- agieval_sat_en_without_passage 1 none 0 acc 0.4175 ± 0.0344
none 0 acc_norm 0.3738 ± 0.0338
- agieval_sat_math 1 none 0 acc 0.4227 ± 0.0334
none 0 acc_norm 0.3682 ± 0.0326

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric Value
Avg. 66.50
AI2 Reasoning Challenge (25-Shot) 60.84
HellaSwag (10-Shot) 84.13
MMLU (5-Shot) 64.69
TruthfulQA (0-shot) 56.34
Winogrande (5-shot) 78.22
GSM8k (5-shot) 54.81
Downloads last month
110
Safetensors
Model size
8.03B params
Tensor type
BF16
·
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Finetuned from

Datasets used to train Locutusque/llama-3-neural-chat-v1-8b

Space using Locutusque/llama-3-neural-chat-v1-8b 1

Evaluation results