---
tags:
  - fp8
  - vllm
---

# Meta-Llama-3-8B-Instruct-FP8

## Model Overview

Meta-Llama-3-8B-Instruct quantized to FP8 weights and activations using per-tensor quantization, ready for inference with vLLM >= 0.5.0.
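Per-tensor quantization means a single scale factor is shared by an entire weight (or activation) tensor, chosen so the tensor's largest magnitude maps to the edge of the FP8 dynamic range. The sketch below illustrates that idea in plain Python under stated assumptions: `FP8_E4M3_MAX` is the largest finite magnitude of the e4m3 format (448), and the sketch models only the scaling and clipping, not the mantissa rounding of a true float8 cast. The helper names are illustrative, not AutoFP8's API.

```python
FP8_E4M3_MAX = 448.0  # largest finite magnitude in the FP8 e4m3 format

def quantize_per_tensor(values):
    """Map a whole tensor into the FP8 e4m3 dynamic range with one scale.

    Simplified: models only scaling and clipping, not float8 rounding.
    """
    scale = max(abs(v) for v in values) / FP8_E4M3_MAX
    q = [max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, v / scale)) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate original values from quantized ones."""
    return [v * scale for v in q]

weights = [0.1, -3.2, 10.0, 7.5]
q, scale = quantize_per_tensor(weights)
restored = dequantize(q, scale)
```

Because one scale covers the whole tensor, a single outlier value stretches the range for every other element; that is the usual accuracy trade-off versus per-channel schemes.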

## Usage and Creation

Produced using AutoFP8 with calibration samples from the ultrachat dataset.
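A minimal inference sketch with vLLM (>= 0.5.0, which added FP8 support). The model identifier below is a placeholder; substitute the actual repository id or local path of this checkpoint.

```python
from vllm import LLM, SamplingParams

# Placeholder model id; replace with this checkpoint's repo id or local path.
llm = LLM(model="Meta-Llama-3-8B-Instruct-FP8")

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["What is FP8 quantization?"], params)
print(outputs[0].outputs[0].text)
```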

## Evaluation

Open LLM Leaderboard evaluation scores:

| Benchmark | Meta-Llama-3-8B-Instruct | Meta-Llama-3-8B-Instruct-FP8 (this model) |
| --- | --- | --- |
| arc-c (25-shot) | 62.54 | 61.77 |
| hellaswag (10-shot) | 78.83 | 78.56 |
| mmlu (5-shot) | 66.60 | 66.27 |
| truthfulqa (0-shot) | 52.44 | 52.35 |
| winogrande (5-shot) | 75.93 | 76.40 |
| gsm8k (5-shot) | 75.96 | 73.99 |
| **Average accuracy** | **68.71** | **68.22** |
| **Recovery** | **100%** | **99.28%** |