
GGUF quants of gghfez/gemma-2-27b-rp-c2-v2, a finetune of the gemma-2-27b base model.

All quants have FP16 input tensors and output weights. I found that quantizing these degraded the quality significantly.
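If you want to check which tensors stayed at FP16, the gguf Python package that ships with llama.cpp can read tensor types straight from the file. A minimal sketch, assuming you've downloaded one of the quant files locally; `GGUFReader` is the package's reader class, but the exact field names may vary by version:

```python
# Sketch: list tensor names and quantization types in a GGUF file.
# Assumes `pip install gguf` and a quant file from this repo on disk.
from gguf import GGUFReader

reader = GGUFReader("gemma-2-27b-rp-c2-v2.IQ4_XSl.gguf")

for tensor in reader.tensors:
    # tensor_type is a quantization-type enum; the embeddings and
    # output head should report F16, the rest the quant's type.
    print(f"{tensor.name}: {tensor.tensor_type.name}")
```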

gemma-2-27b-rp-c2-v2.IQ4_XSl.gguf fits into 16GB of VRAM with 16k context.
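For reference, here's roughly how that quant could be loaded with llama-cpp-python; a minimal sketch, not an official recipe. The 16384-token context and full GPU offload are my assumptions matching the VRAM note above:

```python
# Sketch: run the IQ4_XSl quant with llama-cpp-python at 16k context.
# Assumes `pip install llama-cpp-python` built with GPU support.
from llama_cpp import Llama

llm = Llama(
    model_path="gemma-2-27b-rp-c2-v2.IQ4_XSl.gguf",
    n_ctx=16384,      # 16k context, per the VRAM note above
    n_gpu_layers=-1,  # offload all layers to the GPU
)

out = llm(
    "<|im_start|>user\nHi there!<|im_end|>\n<|im_start|>assistant\n",
    max_tokens=128,
    stop=["<|im_end|>"],  # stop at the end-of-turn tag
)
print(out["choices"][0]["text"])
```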

Changes since V1:

  • Filtered junk out of the dataset
  • Prepended Gemma formatting to the ChatML template (the so-called gemma_chatml format)

I've been using the IQ4_XSl quant with SillyTavern, and it seems to work well. The latest SillyTavern has a 'gemma2' template which matches the training, but ChatML works fine for me.

Prompting

The model has been instruct-tuned with the Gemma_ChatML formatting. A typical input looks like this:

<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
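If you're building prompts by hand rather than through a frontend, a small helper keeps the turns consistent. A hypothetical sketch (the function name is mine, not part of the model or any library):

```python
# Sketch: assemble a Gemma_ChatML prompt from a list of chat turns.
def format_gemma_chatml(messages):
    """messages: list of (role, content) pairs, e.g. ("user", "Hi there!")."""
    parts = [f"<|im_start|>{role}\n{content}<|im_end|>" for role, content in messages]
    # Leave an open assistant turn for the model to complete.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = format_gemma_chatml([
    ("user", "Hi there!"),
    ("assistant", "Nice to meet you!"),
    ("user", "Can I ask a question?"),
])
```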

Training

Trained on a subset of the synthetic RP dataset Sao10K/c2-Logs-Filtered.
