---
base_model: black-forest-labs/FLUX.1-dev
library_name: gguf
license: other
license_name: flux-1-dev-non-commercial-license
license_link: LICENSE.md
quantized_by: mo137
tags:
- text-to-image
- image-generation
- flux
---

FLUX.1-dev in a few experimental custom formats that mix **Q8_0**, **fp16**, and **fp32** tensors.
Converted from black-forest-labs' original bf16 weights.

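### Motivation
Flux's weights were published in bf16.
Converting bf16 to fp16 is slightly lossy because fp16 has a narrower exponent range, whereas converting to fp32 is lossless because fp32 represents every bf16 value exactly.
I experimented with mixed tensor formats to see whether they would improve output quality.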
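
As a rough illustration of the point about lossy and lossless conversion, the sketch below enumerates every finite bf16 bit pattern and checks whether it survives a round trip through fp32 and through fp16. It uses only Python's `struct` module; the helper names are made up for this example, and it says nothing about how many of Flux's actual weights fall outside the fp16 range.

```python
import struct


def bf16_bits_to_float(bits: int) -> float:
    """Decode a 16-bit bf16 pattern; bf16 is the upper 16 bits of an fp32."""
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]


def roundtrip_fp32(x: float) -> float:
    """Store x as fp32 and read it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def roundtrip_fp16(x: float) -> float:
    """Store x as fp16 and read it back; values too large for fp16 become +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return float("inf") if x > 0 else float("-inf")


def finite_bf16_values():
    """Yield every finite bf16 value (exponent bits not all ones)."""
    for bits in range(1 << 16):
        if (bits >> 7) & 0xFF != 0xFF:
            yield bf16_bits_to_float(bits)


values = list(finite_bf16_values())
fp32_exact = sum(roundtrip_fp32(v) == v for v in values)
fp16_exact = sum(roundtrip_fp16(v) == v for v in values)

print(f"finite bf16 values: {len(values)}")
print(f"exact in fp32:      {fp32_exact}")
print(f"exact in fp16:      {fp16_exact}")
```

Every finite bf16 value should come back unchanged from fp32, while the fp16 count is lower: very small magnitudes are rounded (for example, 2^-30 becomes 0) and very large ones overflow.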

### Evaluation
I compared the outputs, but I can't say with any certainty whether these models are significantly better than pure Q8_0.
You're probably better off using Q8_0, but I thought I'd share these in case someone finds them useful.

Higher bits-per-weight (bpw) values result in slower computation:
```
 20 s  Q8_0
 23 s  11.0bpw-txt16
 30 s  fp16
 37 s  16.4bpw-txt32
310 s  fp32
```

In the txt16 and txt32 files, I quantized only these layers to Q8_0, except where a tensor was one-dimensional:
```
img_mlp.0
img_mlp.2
img_mod.lin
linear1
linear2
modulation.lin
```
I left all of these layers at fp16 (txt16) or fp32 (txt32), respectively:
```
txt_mlp.0
txt_mlp.2
txt_mod.lin
```
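
The selection rule above can be sketched as a small function. This is only an illustration, not the script used to produce the files: the function name, the `"F16"`/`"F32"`/`"Q8_0"` labels, and the full tensor names in the demo (such as `double_blocks.0.img_mlp.0.weight`) are assumptions, and the sketch assumes that every tensor not matched by the first list stays at the high-precision type. The last three lines show sample calls.

```python
# Name fragments copied from the first list above (layers quantized to Q8_0).
Q8_0_LAYERS = (
    "img_mlp.0",
    "img_mlp.2",
    "img_mod.lin",
    "linear1",
    "linear2",
    "modulation.lin",
)


def choose_tensor_type(name: str, shape: tuple, high_precision: str) -> str:
    """Pick a storage type for one tensor.

    high_precision is "F16" for the txt16 files and "F32" for the txt32 files.
    """
    if high_precision not in ("F16", "F32"):
        raise ValueError("high_precision must be 'F16' or 'F32'")
    is_one_dimensional = len(shape) <= 1
    if not is_one_dimensional and any(frag in name for frag in Q8_0_LAYERS):
        return "Q8_0"
    # Assumption: everything else, including txt_mlp.0, txt_mlp.2 and
    # txt_mod.lin, is kept at the high-precision type.
    return high_precision


# Hypothetical tensor names and shapes, for illustration only.
print(choose_tensor_type("double_blocks.0.img_mlp.0.weight", (12288, 3072), "F16"))  # Q8_0
print(choose_tensor_type("double_blocks.0.img_mlp.0.bias", (12288,), "F16"))         # F16
print(choose_tensor_type("double_blocks.0.txt_mlp.0.weight", (12288, 3072), "F32"))  # F32
```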
The resulting bpw numbers are only approximations derived from file size.
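
A minimal sketch of that kind of estimate: divide the file size in bits by the number of weights. The function name and the toy numbers are made up for illustration and do not describe any of the files here. The one grounded figure is Q8_0 itself, which in ggml stores each block of 32 weights as 32 int8 values plus one fp16 scale, giving 8.5 bpw.

```python
Q8_0_BLOCK_WEIGHTS = 32      # weights per Q8_0 block
Q8_0_BLOCK_BYTES = 2 + 32    # one fp16 scale + 32 int8 values


def approx_bpw(file_size_bytes: int, n_weights: int) -> float:
    """Approximate bits per weight from a byte count and a weight count."""
    if n_weights <= 0:
        raise ValueError("n_weights must be positive")
    return file_size_bytes * 8 / n_weights


# Exact figure for a single Q8_0 block.
print(approx_bpw(Q8_0_BLOCK_BYTES, Q8_0_BLOCK_WEIGHTS))  # 8.5

# Toy example: a hypothetical 1375-byte file holding 1000 weights.
print(approx_bpw(1375, 1000))  # 11.0
```

Because a real GGUF file also contains a header and metadata, an estimate based on the whole file size comes out slightly higher than the true average over the tensor data.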

---

This is a direct GGUF conversion of [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev/tree/main).

As this is a quantized model, not a finetune, all the original restrictions and license terms still apply.

The model files can be used with the [ComfyUI-GGUF](https://github.com/city96/ComfyUI-GGUF) custom node.

Place the model files in `ComfyUI/models/unet`; see the GitHub readme for further install instructions.

Please refer to [this chart](https://github.com/ggerganov/llama.cpp/blob/master/examples/perplexity/README.md#llama-3-8b-scoreboard) for a basic overview of quantization types.

(Model card mostly copied from [city96/FLUX.1-dev-gguf](https://huggingface.co/city96/FLUX.1-dev-gguf), which contains conventional and useful GGUF files.)