
h2o-danube2-1.8b-chat-GGUF

Description

This repo contains GGUF-format model files for h2o-danube2-1.8b-chat, quantized using the llama.cpp framework.

The table below summarizes the quantized versions of h2o-danube2-1.8b-chat and shows the trade-off between model size, speed, and quality.

| Name | Quant method | Model size | MT-Bench AVG | Perplexity | Tokens per second |
|---|---|---|---|---|---|
| h2o-danube2-1.8b-chat-F16.gguf | F16 | 3.66 GB | 5.60 | 8.02 | 797 |
| h2o-danube2-1.8b-chat-Q8_0.gguf | Q8_0 | 1.95 GB | 5.51 | 8.02 | 1156 |
| h2o-danube2-1.8b-chat-Q6_K.gguf | Q6_K | 1.50 GB | 5.51 | 8.03 | 1131 |
| h2o-danube2-1.8b-chat-Q5_K_M.gguf | Q5_K_M | 1.30 GB | 5.56 | 8.10 | 1172 |
| h2o-danube2-1.8b-chat-Q5_K_S.gguf | Q5_K_S | 1.27 GB | 5.49 | 8.12 | 1107 |
| h2o-danube2-1.8b-chat-Q4_K_M.gguf | Q4_K_M | 1.11 GB | 5.60 | 8.27 | 1162 |
| h2o-danube2-1.8b-chat-Q4_K_S.gguf | Q4_K_S | 1.06 GB | 5.59 | 8.34 | 1270 |
| h2o-danube2-1.8b-chat-Q3_K_L.gguf | Q3_K_L | 0.98 GB | 5.23 | 8.72 | 1442 |
| h2o-danube2-1.8b-chat-Q3_K_M.gguf | Q3_K_M | 0.91 GB | 4.91 | 8.81 | 1107 |
| h2o-danube2-1.8b-chat-Q3_K_S.gguf | Q3_K_S | 0.82 GB | 4.03 | 10.12 | 1103 |
| h2o-danube2-1.8b-chat-Q2_K.gguf | Q2_K | 0.71 GB | 3.03 | 12.56 | 1160 |

Columns in the table are:

  • Name -- model name and link
  • Quant method -- quantization method
  • Model size -- size of the model in gigabytes
  • MT-Bench AVG -- average MT-Bench benchmark score. Scores range from 1 to 10; higher is better
  • Perplexity -- perplexity on the WikiText-2 dataset, as reported by the llama.cpp perplexity test. Lower is better
  • Tokens per second -- generation speed in tokens per second, as reported by the llama.cpp perplexity test. Higher is better. Speed tests were run on a single H100 GPU
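
One way to read the trade-off in the table is to pick the smallest file whose quality stays close to the F16 baseline. The sketch below copies the table values into a list and filters on a relative perplexity tolerance and an optional MT-Bench floor. The function name `smallest_within` and both thresholds are illustrative assumptions, not part of the model card or of llama.cpp.

```python
# (quant, size_gb, mt_bench_avg, perplexity, tokens_per_second), copied from the table
ROWS = [
    ("F16", 3.66, 5.60, 8.02, 797),
    ("Q8_0", 1.95, 5.51, 8.02, 1156),
    ("Q6_K", 1.50, 5.51, 8.03, 1131),
    ("Q5_K_M", 1.30, 5.56, 8.10, 1172),
    ("Q5_K_S", 1.27, 5.49, 8.12, 1107),
    ("Q4_K_M", 1.11, 5.60, 8.27, 1162),
    ("Q4_K_S", 1.06, 5.59, 8.34, 1270),
    ("Q3_K_L", 0.98, 5.23, 8.72, 1442),
    ("Q3_K_M", 0.91, 4.91, 8.81, 1107),
    ("Q3_K_S", 0.82, 4.03, 10.12, 1103),
    ("Q2_K", 0.71, 3.03, 12.56, 1160),
]


def smallest_within(max_ppl_increase, min_mt_bench=0.0):
    """Return the smallest row whose perplexity is at most
    (1 + max_ppl_increase) times the F16 perplexity and whose
    MT-Bench score is at least min_mt_bench.

    Hypothetical helper for illustration only.
    """
    base_ppl = ROWS[0][3]  # F16 is the unquantized baseline
    limit = base_ppl * (1 + max_ppl_increase)
    candidates = [r for r in ROWS if r[3] <= limit and r[2] >= min_mt_bench]
    return min(candidates, key=lambda r: r[1])  # smallest file size wins


if __name__ == "__main__":
    quant, size_gb, mt, ppl, tps = smallest_within(0.05)
    print(f"{quant}: {size_gb} GB, {size_gb / ROWS[0][1]:.0%} of F16 size")
```

With a 5% perplexity tolerance this selects Q4_K_S; tightening the tolerance to zero leaves only F16 and Q8_0, so Q8_0 is returned.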

Prompt template

<|prompt|>Why is drinking water so healthy?</s><|answer|>
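
The template above can be applied programmatically. The following is a minimal sketch: `build_prompt` only wraps a single user message in the markers shown in the template, and `generate` assumes the llama-cpp-python package is installed and that a GGUF file from this repo has been downloaded to a local path. Both function names, the `max_tokens` default, and the choice of `</s>` as a stop string are assumptions made for illustration.

```python
def build_prompt(user_message: str) -> str:
    """Wrap one user message in the single-turn template from the model card."""
    return f"<|prompt|>{user_message}</s><|answer|>"


def generate(model_path: str, user_message: str, max_tokens: int = 256) -> str:
    """Run one completion with llama-cpp-python (assumed to be installed).

    model_path is a local path to one of the .gguf files listed in the table.
    """
    from llama_cpp import Llama  # imported lazily so build_prompt works without it

    llm = Llama(model_path=model_path)
    # Assumption: stopping on "</s>" ends generation at the end of the answer.
    out = llm(build_prompt(user_message), max_tokens=max_tokens, stop=["</s>"])
    return out["choices"][0]["text"]


if __name__ == "__main__":
    print(build_prompt("Why is drinking water so healthy?"))
```

Calling `generate("h2o-danube2-1.8b-chat-Q4_K_M.gguf", "Why is drinking water so healthy?")` would then return the model's answer text, provided the file exists at that path.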

Format: GGUF
Model size: 1.83B params
Architecture: llama
