
GGUF quants of gghfez/gemma-2-27b-rp-c2-v2, a finetune of the gemma-2-27b base model.

All quants have FP16 input tensors and output weights. I found that quantizing these degraded the quality significantly.
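If you want to check which tensors stayed at FP16, the gguf Python package that ships with llama.cpp can read tensor types straight from the file. A minimal sketch, assuming you've downloaded one of the quant files locally; `GGUFReader` is the package's reader class, but the exact field names may vary by version:

```python
# Sketch: list tensor names and quantization types in a GGUF file.
# Assumes `pip install gguf` and a quant file from this repo on disk.
from gguf import GGUFReader

reader = GGUFReader("gemma-2-27b-rp-c2-v2.IQ4_XSl.gguf")

for tensor in reader.tensors:
    # tensor_type is a quantization-type enum; the embeddings and
    # output head should report F16, the rest the quant's type.
    print(f"{tensor.name}: {tensor.tensor_type.name}")
```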

gemma-2-27b-rp-c2-v2.IQ4_XSl.gguf fits into 16GB of VRAM with 16k context.
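For reference, here's roughly how that quant could be loaded with llama-cpp-python; a minimal sketch, not an official recipe. The 16384-token context and full GPU offload are my assumptions matching the VRAM note above:

```python
# Sketch: run the IQ4_XSl quant with llama-cpp-python at 16k context.
# Assumes `pip install llama-cpp-python` built with GPU support.
from llama_cpp import Llama

llm = Llama(
    model_path="gemma-2-27b-rp-c2-v2.IQ4_XSl.gguf",
    n_ctx=16384,      # 16k context, per the VRAM note above
    n_gpu_layers=-1,  # offload all layers to the GPU
)

out = llm(
    "<|im_start|>user\nHi there!<|im_end|>\n<|im_start|>assistant\n",
    max_tokens=128,
    stop=["<|im_end|>"],  # stop at the end-of-turn tag
)
print(out["choices"][0]["text"])
```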

Changes since V1:

  • Filtered junk out of the dataset
  • Prepended Gemma formatting to the ChatML template (the so-called gemma_chatml format)

I've been using the IQ4_XSl quant with SillyTavern, and it seems to work well. The latest SillyTavern has a 'gemma2' template which matches the training, but ChatML works fine for me.

Prompting

The model has been instruct-tuned with the Gemma_ChatML formatting. A typical input looks like this:

<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
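If you're building prompts by hand rather than through a frontend, a small helper keeps the turns consistent. A hypothetical sketch (the function name is mine, not part of the model or any library):

```python
# Sketch: assemble a Gemma_ChatML prompt from a list of chat turns.
def format_gemma_chatml(messages):
    """messages: list of (role, content) pairs, e.g. ("user", "Hi there!")."""
    parts = [f"<|im_start|>{role}\n{content}<|im_end|>" for role, content in messages]
    # Leave an open assistant turn for the model to complete.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = format_gemma_chatml([
    ("user", "Hi there!"),
    ("assistant", "Nice to meet you!"),
    ("user", "Can I ask a question?"),
])
```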

Training

Trained on a subset of the synthetic RP dataset Sao10K/c2-Logs-Filtered.
