---
base_model: unsloth/gemma-2-9b-bnb-4bit
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- gemma2
- gguf
---
# Uploaded model
- **Developed by:** Deeokay
- **License:** apache-2.0
- **Finetuned from model :** unsloth/gemma-2-9b-bnb-4bit
This gemma2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
# README
This is a test model, fine-tuned with the following:
- a private dataset
- a slightly customized Alpaca chat template
- Works with `ollama create`, but requires a customized Modelfile
- One reason for this was that I wanted to try Q2_K quantization and see whether it was actually any good(?) -> Exceeds expectations!!
- My examples are based on the unsloth.Q2_K.gguf file; however, the other quantizations should work as well
# HOW TO USE
The whole point of the conversion for me was that I wanted to be able to use it through Ollama (or other local options).
For Ollama, the model needs to be a GGUF file. Once you have this, it is pretty straightforward.
If you want to try it first, the Q2_K version of this model is available in Ollama => deeokay/gemma2custom
```shell
ollama pull deeokay/gemma2custom
```
# Quick Start:
- You must already have Ollama running in your environment
- Download the unsloth.Q2_K.gguf model from Files
- In the same directory, create a file called "Modelfile"
- Inside the "Modelfile", type the following (make sure the `FROM` path matches the name of the file you downloaded)
```text
FROM ./GEMMA2_unsloth.Q2_K.gguf
TEMPLATE """Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
{{.Prompt}}
### Response:
"""
PARAMETER stop "<start_of_turn>"
PARAMETER stop "<end_of_turn>"
PARAMETER stop "<eos>"
PARAMETER stop "### "
PARAMETER stop "###: "
PARAMETER stop "###"
PARAMETER temperature 0.4
PARAMETER top_p 0.95
PARAMETER top_k 40
PARAMETER num_ctx 2048
SYSTEM """You are an AI teacher designed to educate young, curious minds. Your responses should be:
1. Accurate and precise, based on verified facts and data.
2. Age-appropriate and easily understandable.
3. Encouraging further exploration and learning.
4. Never containing made-up or speculative information.
5. Brief but comprehensive, covering key points without overwhelming detail.
If you're unsure about any information, clearly state that you don't have enough verified data to answer accurately."""
```
- Save the file and go back to the folder (the folder where the model and Modelfile exist)
- Now, in a terminal, make sure you are in that same folder and type the following command
```shell
ollama create mycustomai -f Modelfile  # "mycustomai" <- you can name it anything you want
```
After that, you should be able to use this model to chat!
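Once the model has been created, it can also be queried from a script. The snippet below is a minimal sketch, assuming Ollama is serving its default local REST endpoint (`http://localhost:11434/api/generate`) and that the model was created as `mycustomai`. The helper names `build_payload` and `ask` are made up for illustration.

```python
import json
import urllib.request

# Assumption: Ollama's default local address; change it if your server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model, prompt, temperature=None):
    """Build the JSON body for a single, non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if temperature is not None:
        # Overrides the temperature set in the Modelfile for this request only.
        payload["options"] = {"temperature": temperature}
    return payload


def ask(model, prompt, temperature=None, url=OLLAMA_URL):
    """Send the prompt to a running Ollama server and return the generated text."""
    body = json.dumps(build_payload(model, prompt, temperature)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

With the Ollama server running, something like `print(ask("mycustomai", "Why is the sky blue?"))` should print the model's answer.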
This GGUF is based on Gemma2-9B-4bit by Unsloth.
# NOTE: DISCLAIMER
Please note this model is not intended for production; it is the result of fine-tuning done as a self-learning exercise.
This is my fine-tuning pass with a personalized, custom dataset.
Please feel free to customize the Modelfile, and if you do get a better response than mine, please share!!
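One way to experiment is to generate Modelfile variants from a script. The following is a rough sketch, not part of the original workflow: `build_modelfile` is a hypothetical helper, and the default file name and parameter values simply mirror the Quick Start Modelfile (the system prompt is left out to keep it short).

```python
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n"
    "### Instruction:\n"
    "{{.Prompt}}\n"
    "### Response:\n"
)


def build_modelfile(
    gguf_path="./GEMMA2_unsloth.Q2_K.gguf",
    temperature=0.4,
    top_p=0.95,
    top_k=40,
    num_ctx=2048,
    stops=("<start_of_turn>", "<end_of_turn>", "<eos>", "###"),
):
    """Return Modelfile text for the given GGUF file and sampling settings."""
    lines = [f"FROM {gguf_path}", f'TEMPLATE """{ALPACA_TEMPLATE}"""']
    # One PARAMETER line per stop sequence, as in the Quick Start Modelfile.
    lines += [f'PARAMETER stop "{stop}"' for stop in stops]
    lines += [
        f"PARAMETER temperature {temperature}",
        f"PARAMETER top_p {top_p}",
        f"PARAMETER top_k {top_k}",
        f"PARAMETER num_ctx {num_ctx}",
    ]
    return "\n".join(lines) + "\n"
```

Writing the result of `build_modelfile(temperature=0.2)` to a file named `Modelfile` and re-running `ollama create` would then give a variant to compare against.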
If you would like to know how I started creating my dataset, you can check this link:
[Crafting GPT2 for Personalized AI-Preparing Data the Long Way (Part1)](https://medium.com/@deeokay/the-soul-in-the-machine-crafting-gpt2-for-personalized-ai-9d38be3f635f)
As the data was originally created with custom GPT2 special tokens, I had to convert it to an Alpaca template.
However, I got creative again.. the training data uses the following template:
```python
import pandas as pd
from datasets import Dataset

prompt = """Below is an instruction that describes a task, with an analysis that provides further context. Write a response, classification and sentiment that appropriately completes the request.
### Instruction:
{}
### Analysis:
{}
### Response:
{}
### Classification:
{}
### Sentiment:
{}
"""

# `tokenizer` is assumed to come from the earlier model-loading step (not shown here)
EOS_TOKEN = tokenizer.eos_token # Must add EOS_TOKEN

def formatting_prompts_func(examples):
    # Each column arrives as a list because map() is called with batched=True
    instructions = examples["Question"]
    analyses = examples["Analysis"]
    responses = examples["Answer"]
    classifications = examples['Classification']
    sentiments = examples['Sentiment']
    texts = []
    for instruction, analysis, response, classification, sentiment in zip(instructions, analyses, responses, classifications, sentiments):
        # Must add EOS_TOKEN, otherwise your generation will go on forever!
        text = prompt.format(instruction, analysis, response, classification, sentiment) + EOS_TOKEN
        texts.append(text)
    return { "text" : texts }

data_path = 'file to dataset'  # replace with the path to your CSV dataset
df = pd.read_csv(data_path)
dataset = Dataset.from_pandas(df)
# Shuffle the dataset (reassigned so the shuffled copy is the one that gets mapped)
dataset = dataset.shuffle(seed=42) # Seed for reproducibility
dataset = dataset.map(formatting_prompts_func, batched = True,)
```
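To make the training format concrete, here is a small, self-contained sketch of how a single row is rendered. The sample row is invented for illustration, and `<eos>` is used as a stand-in for `tokenizer.eos_token`, which depends on the tokenizer actually loaded.

```python
PROMPT = """Below is an instruction that describes a task, with an analysis that provides further context. Write a response, classification and sentiment that appropriately completes the request.
### Instruction:
{}
### Analysis:
{}
### Response:
{}
### Classification:
{}
### Sentiment:
{}
"""

EOS_TOKEN = "<eos>"  # assumption: stand-in for tokenizer.eos_token


def format_row(row):
    """Render one dataset row (a dict of column values) into a training text."""
    return PROMPT.format(
        row["Question"],
        row["Analysis"],
        row["Answer"],
        row["Classification"],
        row["Sentiment"],
    ) + EOS_TOKEN


# Hypothetical row, only here to show the layout of the final string.
sample_row = {
    "Question": "Why is the sky blue?",
    "Analysis": "A science question about light.",
    "Answer": "Air scatters blue light more than red light.",
    "Classification": "Science",
    "Sentiment": "Neutral",
}

print(format_row(sample_row))
```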
# SIDENOTE :
Because the fine-tuning data was formatted that way, you could technically try this and make it work..(?) I am still testing, but please feel free to try it as well (and do let me know). This 'sort of' works, but I am still tweaking it.
```text
FROM ./GEMMA2_unsloth.Q2_K.gguf
PARAMETER stop "<eos>"
PARAMETER stop "### Instruction:"
PARAMETER stop "### Analysis:"
PARAMETER stop "### Response:"
PARAMETER stop "### Classification:"
PARAMETER stop "### Sentiment:"
PARAMETER stop "###"
PARAMETER temperature 0.2
TEMPLATE """Below is an instruction that describes a task, with an analysis that provides further context. Write a response, classification and sentiment that appropriately completes the request.
### Instruction:
{{.Prompt}}
### Analysis:
### Response:
### Classification:
### Sentiment:
"""
SYSTEM """You are a helpful AI assistant that provides responses based on given instructions and analyses. Your responses should be appropriate, informative, and tailored to the context provided. Always follow the format specified in the prompt, providing a response, classification, and sentiment for each query."""
```
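If the model does emit text in the multi-section training format, the raw output may need to be split back into its fields. The sketch below is one hypothetical way to do that with a regular expression, assuming the stop sequences allow the section headers through; `parse_sections` and the sample output are invented, and real outputs may not follow the format this cleanly.

```python
import re

# Matches a section header such as "### Response:" at the start of a line.
SECTION_PATTERN = re.compile(
    r"^### (Instruction|Analysis|Response|Classification|Sentiment):[ \t]*\n?",
    re.MULTILINE,
)


def parse_sections(text):
    """Split template-formatted text into a {section name: content} dict."""
    sections = {}
    matches = list(SECTION_PATTERN.finditer(text))
    for i, match in enumerate(matches):
        # A section runs until the next header, or to the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        content = text[match.end():end].replace("<eos>", "").strip()
        sections[match.group(1)] = content
    return sections


# Hypothetical model output, used only to exercise the parser.
sample_output = (
    "### Analysis:\nA question about optics.\n"
    "### Response:\nSunlight scatters in the air.\n"
    "### Classification:\nScience\n"
    "### Sentiment:\nNeutral<eos>"
)

print(parse_sections(sample_output))
```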
I will be updating this periodically.. as I have limited Colab resources..