Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)
[Discord](https://discord.gg/pvy7H8DZMG)
[Request more models](https://github.com/RichardErkhov/quant_request)

Fireball-MathMistral-Nemo-Base-2407 - GGUF
- Model creator: https://huggingface.co/EpistemeAI/
- Original model: https://huggingface.co/EpistemeAI/Fireball-MathMistral-Nemo-Base-2407/

| Name | Quant method | Size |
| ---- | ---- | ---- |
| [Fireball-MathMistral-Nemo-Base-2407.Q2_K.gguf](https://huggingface.co/RichardErkhov/EpistemeAI_-_Fireball-MathMistral-Nemo-Base-2407-gguf/blob/main/Fireball-MathMistral-Nemo-Base-2407.Q2_K.gguf) | Q2_K | 4.46GB |
| [Fireball-MathMistral-Nemo-Base-2407.IQ3_XS.gguf](https://huggingface.co/RichardErkhov/EpistemeAI_-_Fireball-MathMistral-Nemo-Base-2407-gguf/blob/main/Fireball-MathMistral-Nemo-Base-2407.IQ3_XS.gguf) | IQ3_XS | 4.94GB |
| [Fireball-MathMistral-Nemo-Base-2407.IQ3_S.gguf](https://huggingface.co/RichardErkhov/EpistemeAI_-_Fireball-MathMistral-Nemo-Base-2407-gguf/blob/main/Fireball-MathMistral-Nemo-Base-2407.IQ3_S.gguf) | IQ3_S | 5.18GB |
| [Fireball-MathMistral-Nemo-Base-2407.Q3_K_S.gguf](https://huggingface.co/RichardErkhov/EpistemeAI_-_Fireball-MathMistral-Nemo-Base-2407-gguf/blob/main/Fireball-MathMistral-Nemo-Base-2407.Q3_K_S.gguf) | Q3_K_S | 5.15GB |
| [Fireball-MathMistral-Nemo-Base-2407.IQ3_M.gguf](https://huggingface.co/RichardErkhov/EpistemeAI_-_Fireball-MathMistral-Nemo-Base-2407-gguf/blob/main/Fireball-MathMistral-Nemo-Base-2407.IQ3_M.gguf) | IQ3_M | 5.33GB |
| [Fireball-MathMistral-Nemo-Base-2407.Q3_K.gguf](https://huggingface.co/RichardErkhov/EpistemeAI_-_Fireball-MathMistral-Nemo-Base-2407-gguf/blob/main/Fireball-MathMistral-Nemo-Base-2407.Q3_K.gguf) | Q3_K | 5.67GB |
| [Fireball-MathMistral-Nemo-Base-2407.Q3_K_M.gguf](https://huggingface.co/RichardErkhov/EpistemeAI_-_Fireball-MathMistral-Nemo-Base-2407-gguf/blob/main/Fireball-MathMistral-Nemo-Base-2407.Q3_K_M.gguf) | Q3_K_M | 5.67GB |
| [Fireball-MathMistral-Nemo-Base-2407.Q3_K_L.gguf](https://huggingface.co/RichardErkhov/EpistemeAI_-_Fireball-MathMistral-Nemo-Base-2407-gguf/blob/main/Fireball-MathMistral-Nemo-Base-2407.Q3_K_L.gguf) | Q3_K_L | 6.11GB |
| [Fireball-MathMistral-Nemo-Base-2407.IQ4_XS.gguf](https://huggingface.co/RichardErkhov/EpistemeAI_-_Fireball-MathMistral-Nemo-Base-2407-gguf/blob/main/Fireball-MathMistral-Nemo-Base-2407.IQ4_XS.gguf) | IQ4_XS | 6.33GB |
| [Fireball-MathMistral-Nemo-Base-2407.Q4_0.gguf](https://huggingface.co/RichardErkhov/EpistemeAI_-_Fireball-MathMistral-Nemo-Base-2407-gguf/blob/main/Fireball-MathMistral-Nemo-Base-2407.Q4_0.gguf) | Q4_0 | 6.59GB |
| [Fireball-MathMistral-Nemo-Base-2407.IQ4_NL.gguf](https://huggingface.co/RichardErkhov/EpistemeAI_-_Fireball-MathMistral-Nemo-Base-2407-gguf/blob/main/Fireball-MathMistral-Nemo-Base-2407.IQ4_NL.gguf) | IQ4_NL | 6.65GB |
| [Fireball-MathMistral-Nemo-Base-2407.Q4_K_S.gguf](https://huggingface.co/RichardErkhov/EpistemeAI_-_Fireball-MathMistral-Nemo-Base-2407-gguf/blob/main/Fireball-MathMistral-Nemo-Base-2407.Q4_K_S.gguf) | Q4_K_S | 6.63GB |
| [Fireball-MathMistral-Nemo-Base-2407.Q4_K.gguf](https://huggingface.co/RichardErkhov/EpistemeAI_-_Fireball-MathMistral-Nemo-Base-2407-gguf/blob/main/Fireball-MathMistral-Nemo-Base-2407.Q4_K.gguf) | Q4_K | 6.96GB |
| [Fireball-MathMistral-Nemo-Base-2407.Q4_K_M.gguf](https://huggingface.co/RichardErkhov/EpistemeAI_-_Fireball-MathMistral-Nemo-Base-2407-gguf/blob/main/Fireball-MathMistral-Nemo-Base-2407.Q4_K_M.gguf) | Q4_K_M | 6.96GB |
| [Fireball-MathMistral-Nemo-Base-2407.Q4_1.gguf](https://huggingface.co/RichardErkhov/EpistemeAI_-_Fireball-MathMistral-Nemo-Base-2407-gguf/blob/main/Fireball-MathMistral-Nemo-Base-2407.Q4_1.gguf) | Q4_1 | 7.26GB |
| [Fireball-MathMistral-Nemo-Base-2407.Q5_0.gguf](https://huggingface.co/RichardErkhov/EpistemeAI_-_Fireball-MathMistral-Nemo-Base-2407-gguf/blob/main/Fireball-MathMistral-Nemo-Base-2407.Q5_0.gguf) | Q5_0 | 7.93GB |
| [Fireball-MathMistral-Nemo-Base-2407.Q5_K_S.gguf](https://huggingface.co/RichardErkhov/EpistemeAI_-_Fireball-MathMistral-Nemo-Base-2407-gguf/blob/main/Fireball-MathMistral-Nemo-Base-2407.Q5_K_S.gguf) | Q5_K_S | 7.93GB |
| [Fireball-MathMistral-Nemo-Base-2407.Q5_K.gguf](https://huggingface.co/RichardErkhov/EpistemeAI_-_Fireball-MathMistral-Nemo-Base-2407-gguf/blob/main/Fireball-MathMistral-Nemo-Base-2407.Q5_K.gguf) | Q5_K | 8.13GB |
| [Fireball-MathMistral-Nemo-Base-2407.Q5_K_M.gguf](https://huggingface.co/RichardErkhov/EpistemeAI_-_Fireball-MathMistral-Nemo-Base-2407-gguf/blob/main/Fireball-MathMistral-Nemo-Base-2407.Q5_K_M.gguf) | Q5_K_M | 8.13GB |
| [Fireball-MathMistral-Nemo-Base-2407.Q5_1.gguf](https://huggingface.co/RichardErkhov/EpistemeAI_-_Fireball-MathMistral-Nemo-Base-2407-gguf/blob/main/Fireball-MathMistral-Nemo-Base-2407.Q5_1.gguf) | Q5_1 | 8.61GB |
| [Fireball-MathMistral-Nemo-Base-2407.Q6_K.gguf](https://huggingface.co/RichardErkhov/EpistemeAI_-_Fireball-MathMistral-Nemo-Base-2407-gguf/blob/main/Fireball-MathMistral-Nemo-Base-2407.Q6_K.gguf) | Q6_K | 9.37GB |
| [Fireball-MathMistral-Nemo-Base-2407.Q8_0.gguf](https://huggingface.co/RichardErkhov/EpistemeAI_-_Fireball-MathMistral-Nemo-Base-2407-gguf/blob/main/Fireball-MathMistral-Nemo-Base-2407.Q8_0.gguf) | Q8_0 | 12.13GB |

Original model description:
---
base_model: unsloth/Mistral-Nemo-Base-2407-bnb-4bit
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- mistral
- trl
datasets:
- meta-math/MetaMathQA
---

# Uploaded model

- **Developed by:** EpistemeAI
- **License:** apache-2.0
- **Finetuned from model:** unsloth/Mistral-Nemo-Base-2407-bnb-4bit

This Mistral model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
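The file sizes in the quantization table above follow roughly from the model's ~12B parameters and each format's bits per weight. A quick sanity check (the 12.25B parameter count and the bits-per-weight figures are the commonly quoted llama.cpp values, not metadata read from these files; the listed sizes are assumed to be GiB):

```python
# Rough GGUF file-size estimate: parameters * bits-per-weight / 8, in GiB.
N_PARAMS = 12.25e9  # Mistral Nemo is ~12B parameters

def gguf_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate GGUF file size in GiB for a given quantization."""
    return n_params * bits_per_weight / 8 / 1024**3

# Effective bits/weight for a few llama.cpp quant formats
# (Q8_0 stores 34 bytes per 32 weights = 8.5 bpw; Q6_K is 6.5625 bpw).
for name, bpw in [("Q4_0", 4.5), ("Q6_K", 6.5625), ("Q8_0", 8.5)]:
    print(f"{name}: ~{gguf_size_gib(N_PARAMS, bpw):.2f} GiB")
```

Q6_K and Q8_0 land within about 0.02 GiB of the table; the smallest quants run a little larger than this estimate because the embedding and output layers are typically kept at higher precision.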
# Fireball-MathMistral-Nemo-Base-2407

This model is fine-tuned to provide better math responses than Mistral-Nemo-Base-2407.

## Training Dataset

Supervised fine-tuning on the meta-math/MetaMathQA dataset.

# Model Card for Mistral-Nemo-Base-2407

The Fireball-MathMistral-Nemo-Base-2407 Large Language Model (LLM) is a pretrained generative text model of 12B parameters; it significantly outperforms existing models of smaller or similar size. For more details about this model please refer to our release [blog post](https://mistral.ai/news/mistral-nemo/).

## Key features

- Released under the **Apache 2 License**
- Trained with a **128k context window**
- Trained on a large proportion of **multilingual and code data**
- Drop-in replacement of Mistral 7B

## Model Architecture

Mistral Nemo is a transformer model with the following architecture choices:

- **Layers:** 40
- **Dim:** 5,120
- **Head dim:** 128
- **Hidden dim:** 14,336
- **Activation Function:** SwiGLU
- **Number of heads:** 32
- **Number of kv-heads:** 8 (GQA)
- **Vocabulary size:** 2**17 ~= 128k
- **Rotary embeddings (theta = 1M)**

#### Demo

After installing `mistral_inference`, a `mistral-demo` CLI command should be available in your environment.

### Transformers

> [!IMPORTANT]
> NOTE: Until a new release has been made, you need to install transformers from source:
> ```sh
> pip install git+https://github.com/huggingface/transformers.git
> ```

If you want to use Hugging Face `transformers` to generate text, you can do something like this.
```py
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EpistemeAI/Fireball-MathMistral-Nemo-Base-2407"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Hello my name is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

> [!TIP]
> Unlike previous Mistral models, Mistral Nemo requires smaller temperatures. We recommend using a temperature of 0.3.

## Note

`Mistral-Nemo-Base-2407` is a pretrained base model and therefore does not have any moderation mechanisms.
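The 12B figure can be sanity-checked against the architecture numbers in the model card. A back-of-the-envelope count, assuming untied input/output embeddings, GQA attention, and a three-matrix SwiGLU MLP (an estimate, not an official breakdown):

```python
# Back-of-the-envelope parameter count from the published architecture.
layers, dim, head_dim = 40, 5120, 128
n_heads, n_kv_heads = 32, 8
hidden = 14336   # SwiGLU intermediate size
vocab = 2**17    # 131,072 tokens

# GQA attention: Q and output projections are full-width, K/V use kv-heads.
attn = dim * (n_heads * head_dim)           # Q projection
attn += 2 * dim * (n_kv_heads * head_dim)   # K and V projections
attn += (n_heads * head_dim) * dim          # output projection

# SwiGLU MLP: gate, up, and down projections.
mlp = 3 * dim * hidden

# Embedding table plus (assumed untied) LM head; norm weights are negligible.
total = layers * (attn + mlp) + 2 * vocab * dim
print(f"~{total / 1e9:.2f}B parameters")  # → ~12.25B parameters
```

This lands on ~12.25B, consistent with the "12B parameters" claim above.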