
The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters. Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks we tested.

We trained this model for the Kyrgyz language using the linked dataset.

Model Architecture

Mistral-7B-v0.1 is a transformer model, with the following architecture choices:

Grouped-Query Attention
Sliding-Window Attention
Byte-fallback BPE tokenizer
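
The three choices listed above can be illustrated with a small, dependency-free sketch. This is not the model's actual implementation: the function names, the toy sizes, and the character-level vocabulary are assumptions made for illustration, and the byte-fallback helper skips BPE merges entirely. It only shows which positions a sliding-window causal mask allows, how query heads share key/value heads under grouped-query attention, and how a symbol missing from the vocabulary can fall back to `<0xXX>` byte pieces instead of an unknown token.

```python
def sliding_window_causal_mask(seq_len, window):
    """mask[i][j] is True when query position i may attend to key position j.

    A position sees itself and at most (window - 1) earlier positions.
    """
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


def kv_head_for_query_head(query_head, n_query_heads, n_kv_heads):
    """Index of the key/value head shared by a given query head."""
    if n_query_heads % n_kv_heads != 0:
        raise ValueError("n_query_heads must be a multiple of n_kv_heads")
    group_size = n_query_heads // n_kv_heads  # query heads per KV head
    return query_head // group_size


def byte_fallback_pieces(text, vocab):
    """Toy character-level tokenizer with byte fallback (no BPE merges).

    Characters found in `vocab` are kept as pieces; any other character is
    emitted as one <0xXX> piece per UTF-8 byte.
    """
    pieces = []
    for ch in text:
        if ch in vocab:
            pieces.append(ch)
        else:
            pieces.extend("<0x{:02X}>".format(b) for b in ch.encode("utf-8"))
    return pieces


if __name__ == "__main__":
    # Toy sizes chosen for readability, not taken from the model config.
    for row in sliding_window_causal_mask(seq_len=5, window=3):
        print("".join("x" if allowed else "." for allowed in row))
    print([kv_head_for_query_head(h, 8, 2) for h in range(8)])
    # The Kyrgyz letter "ң" is absent from this toy vocabulary.
    print(byte_fallback_pieces("таң", vocab={"т", "а"}))
```

Byte fallback matters for Kyrgyz text because letters such as ң, ө, and ү may be rare in a tokenizer trained mostly on other languages; with fallback they are still representable, at the cost of more tokens per character.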

Troubleshooting

If you see the following error:

KeyError: 'mistral'

Or:

NotImplementedError: Cannot copy out of meta tensor; no data!

Ensure you are using a stable release of Transformers, version 4.34.0 or newer.
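
A quick way to confirm the installed version before loading the model is sketched below. It uses only the standard library; the helper names (`parse_release`, `meets_minimum`, `describe_transformers_install`) are hypothetical, and the parser is deliberately simple: it reads the leading numeric components and ignores pre-release suffixes such as `.dev0` or `rc1`, so it cannot tell a development build from a stable release.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum version named in the troubleshooting note above.
MIN_TRANSFORMERS = (4, 34, 0)


def parse_release(version_string):
    """Leading numeric components of a version, e.g. '4.34.0.dev0' -> (4, 34, 0)."""
    parts = []
    for piece in version_string.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(version_string, minimum=MIN_TRANSFORMERS):
    """True when the numeric release is at least `minimum`."""
    release = parse_release(version_string)
    # Pad short versions such as '4.40' so tuple comparison is well defined.
    release = release + (0,) * (len(minimum) - len(release))
    return release >= minimum


def describe_transformers_install():
    """Human-readable status of the locally installed transformers package."""
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        return "transformers is not installed"
    if meets_minimum(installed):
        return "transformers {} is new enough".format(installed)
    return "transformers {} is older than 4.34.0; upgrade it".format(installed)


if __name__ == "__main__":
    print(describe_transformers_install())
```

If the check reports an older version, upgrading with `pip install -U "transformers>=4.34.0"` should resolve both errors shown above.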

Notice

Mistral 7B is a pretrained base model and therefore does not have any moderation mechanisms.

Model size: 7.24B params (Safetensors)
Tensor type: FP16

Dataset used to train UlutSoftLLC/Mistral-7B-v0.1-kyrgyz-text-completion