---
base_model:
- stabilityai/stable-diffusion-3.5-large-turbo
base_model_relation: quantized
---

## Overview
These models are made to work with [stable-diffusion.cpp](https://github.com/leejet/stable-diffusion.cpp) release [master-ac54e00](https://github.com/leejet/stable-diffusion.cpp/releases/tag/master-ac54e00) onwards. Support for other inference backends is not guaranteed.
Quantized using this PR: https://github.com/leejet/stable-diffusion.cpp/pull/447
Normal K-quants don't work properly with SD3.5-Large models, because over 90% of the weights are in tensors whose shapes don't match the 256-weight superblock size of K-quants, so those tensors can't be K-quantized. Mixing quantization types makes it possible to take advantage of the better fidelity of K-quants to some extent while keeping the file size small.
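To make the shape constraint concrete, here is a minimal sketch of the divisibility check involved. The tensor names and shapes are purely illustrative (not SD3.5's actual tensor list), and this is not sd.cpp's real quantization logic:

```python
# K-quant superblocks pack 256 weights at a time along each tensor row,
# so a tensor is only K-quantizable if its row length (innermost
# dimension) is a multiple of 256; otherwise a fallback type is needed.
SUPERBLOCK_SIZE = 256

def can_k_quantize(shape: tuple[int, ...]) -> bool:
    """Return True if the tensor's innermost dimension fits K-quant superblocks."""
    return shape[-1] % SUPERBLOCK_SIZE == 0

# Hypothetical tensors, for illustration only:
examples = {
    "hypothetical.proj.weight": (4096, 1536),  # 1536 = 6 * 256 -> K-quantizable
    "hypothetical.qkv.weight": (7296, 2432),   # 2432 % 256 = 128 -> needs a fallback type
}
for name, shape in examples.items():
    print(f"{name}: {'k-quant' if can_k_quantize(shape) else 'fallback'}")
```

A mixed-type file like `q4_k_4_0` follows this idea: K-quants where the shape allows it, a legacy type (e.g. q4_0) everywhere else.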

## Files:

### Mixed Types:
- [sd3.5_large_turbo-q2_k_4_0.gguf](https://huggingface.co/stduhpf/SD3.5-Large-Turbo-GGUF-mixed-sdcpp/blob/main/sd3.5_large_turbo-q2_k_4_0.gguf): Smallest quantization yet. Use this if you can't afford anything bigger.
- [sd3.5_large_turbo-q3_k_4_0.gguf](https://huggingface.co/stduhpf/SD3.5-Large-Turbo-GGUF-mixed-sdcpp/blob/main/sd3.5_large_turbo-q3_k_4_0.gguf): Smaller than q4_0, acceptable degradation.
- [sd3.5_large_turbo-q4_k_4_1.gguf](https://huggingface.co/stduhpf/SD3.5-Large-Turbo-GGUF-mixed-sdcpp/blob/main/sd3.5_large_turbo-q4_k_4_1.gguf): Smaller than q4_1, and with comparable degradation. Recommended.
- [sd3.5_large_turbo-q4_k_5_0.gguf](https://huggingface.co/stduhpf/SD3.5-Large-Turbo-GGUF-mixed-sdcpp/blob/main/sd3.5_large_turbo-q4_k_5_0.gguf): Smaller than q5_0, and with comparable degradation. Recommended.

### Legacy types:
- [sd3.5_large_turbo-q4_0.gguf](https://huggingface.co/stduhpf/SD3.5-Large-Turbo-GGUF-mixed-sdcpp/blob/main/legacy/sd3.5_large_turbo-q4_0.gguf): Same size as q4_k_4_0. Not recommended (use q4_k_4_0 instead).
- (I wanted to upload more, but it's not working anymore; maybe I hit a rate limit.)
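As a usage sketch, these files can be passed directly to the sd.cpp CLI. The invocation below is an assumption for illustration: the encoder filenames, sampler settings, and step/CFG values are placeholders, not settings tested with this repo, so adjust them for your build:

```shell
# Illustrative sd.cpp invocation (assumed paths and parameter values).
# SD3.5 also needs the text encoders (clip_l, clip_g, t5xxl) supplied separately.
./sd -m sd3.5_large_turbo-q4_k_4_1.gguf \
  --clip_l clip_l.safetensors --clip_g clip_g.safetensors --t5xxl t5xxl_fp16.safetensors \
  -p "a photo of a corgi wearing a top hat" \
  -W 1024 -H 1024 --steps 4 --cfg-scale 1.0 \
  -o output.png
```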

## Outputs:

| Name | Image | Image | Image |
| ------------------ | -------------------------------- | ---------------------------------- | ---------------------------------- |