Finetune Mistral, Gemma, and Llama 2-5x faster with 70% less memory via Unsloth!

A directly quantized 4-bit model using bitsandbytes. Built with Meta Llama 3.
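Because the 4-bit weights are already baked into this repo, the model loads directly with no on-the-fly quantization pass. Below is a minimal loading sketch following Unsloth's usual `FastLanguageModel` pattern; the `max_seq_length` and `dtype` values are illustrative choices, not requirements:

```python
from unsloth import FastLanguageModel

# Load the pre-quantized 4-bit checkpoint; no separate BitsAndBytesConfig is
# needed, since the bnb quantization config ships with the model.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",
    max_seq_length = 2048,  # illustrative; set to what your task needs
    dtype = None,           # None lets Unsloth auto-detect (bf16 on Ampere+)
    load_in_4bit = True,    # weights are already stored in bnb 4-bit
)
```

For inference only, `FastLanguageModel.for_inference(model)` switches on Unsloth's faster generation path.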

We have a Google Colab Tesla T4 notebook for Llama-3 8b here: https://colab.research.google.com/drive/135ced7oHytdxu3N2DNe1Z0kqjyYIkDXp?usp=sharing

✨ Finetune for Free

All notebooks are beginner-friendly! Add your dataset, click "Run All", and you'll get a 2x faster finetuned model that can be exported to GGUF, served with vLLM, or uploaded to Hugging Face (a training-and-export sketch follows the table below).

| Unsloth supports | Free Notebooks | Performance | Memory use |
|------------------|----------------|-------------|------------|
| Llama-3 8b | ▶️ Start on Colab | 2.4x faster | 58% less |
| Gemma 7b | ▶️ Start on Colab | 2.4x faster | 58% less |
| Mistral 7b | ▶️ Start on Colab | 2.2x faster | 62% less |
| Llama-2 7b | ▶️ Start on Colab | 2.2x faster | 43% less |
| TinyLlama | ▶️ Start on Colab | 3.9x faster | 74% less |
| CodeLlama 34b (A100) | ▶️ Start on Colab | 1.9x faster | 27% less |
| Mistral 7b (1x T4) | ▶️ Start on Kaggle | 5x faster* | 62% less |
| DPO - Zephyr | ▶️ Start on Colab | 1.9x faster | 19% less |
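
Continuing from the loading sketch above, the typical notebook flow is: attach LoRA adapters, train with TRL's `SFTTrainer`, then export. The sketch below assumes Unsloth's documented helpers and a placeholder `dataset` you must supply yourself; the LoRA rank, batch size, step count, quantization method, and Hub repo id are illustrative, not tuned recommendations:

```python
from trl import SFTTrainer
from transformers import TrainingArguments

# Attach LoRA adapters so only a small fraction of weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,                 # LoRA rank (illustrative)
    lora_alpha = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
)

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,            # placeholder: bring your own dataset
    dataset_text_field = "text",
    max_seq_length = 2048,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        max_steps = 60,                 # illustrative short run
        learning_rate = 2e-4,
        output_dir = "outputs",
    ),
)
trainer.train()

# Export options mentioned above: GGUF for llama.cpp, or merged 16-bit
# weights that vLLM can serve and the Hub can host.
model.save_pretrained_gguf("gguf_model", tokenizer, quantization_method = "q4_k_m")
model.push_to_hub_merged("your-username/llama-3-8b-finetune",  # hypothetical repo id
                         tokenizer, save_method = "merged_16bit")
```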
