
I wanted to make a couple of models: ( • ̀ω•́ )✧

In total, the process took four and a half hours: downloading the original model, converting it to a GGUF F16 model, creating imatrix.dat, quantizing the required variants, and uploading them to the repository. 💀

(It’s a pity I don’t have proper hardware for quantization.)
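The steps above can be sketched with llama.cpp's standard tooling. This is a minimal outline, not the exact script used here: the model path, calibration file, and output names are placeholders, and the quantization type (Q4_K_M) is one assumed example.

```shell
# 1. Convert the original HF model to a GGUF F16 model
#    (run from the llama.cpp repository root; paths are placeholders)
python convert_hf_to_gguf.py ./original-model \
    --outfile model-F16.gguf --outtype f16

# 2. Build the importance matrix from a calibration text file
./llama-imatrix -m model-F16.gguf -f calibration.txt -o imatrix.dat

# 3. Quantize with the imatrix to the desired type (e.g. 4-bit Q4_K_M)
./llama-quantize --imatrix imatrix.dat \
    model-F16.gguf model-Q4_K_M.gguf Q4_K_M
```

Each quantization type (Q4_K_M, Q5_K_M, etc.) is produced by repeating step 3, which is where most of the time goes on modest hardware.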

Link to original model and script:

Format: GGUF (4-bit)
Model size: 13B params
Architecture: llama