mo137 committed
Commit 4f2dd61
Parent: 1dcb431

Update README.md

Files changed (1)
  1. README.md +64 -5
README.md CHANGED
@@ -1,5 +1,64 @@
- ---
- license: other
- license_name: flux-1-dev-non-commercial-license
- license_link: LICENSE
- ---
+ ---
+ base_model: black-forest-labs/FLUX.1-dev
+ library_name: gguf
+ license: other
+ license_name: flux-1-dev-non-commercial-license
+ license_link: LICENSE.md
+ quantized_by: mo137
+ tags:
+ - text-to-image
+ - image-generation
+ - flux
+ ---
+
+ Flux.1-dev in a few experimental custom GGUF formats, mixing tensors in **Q8_0**, **fp16**, and **fp32**.
+ Converted from black-forest-labs' original bf16 weights.
+
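+ For context, Q8_0 stores each tensor in blocks of 32 weights, with one fp16 scale per block and one int8 value per weight – 34 bytes per 32 weights, or about 8.5 bpw. A minimal NumPy sketch of the idea (not the actual GGML code; it assumes tensor sizes are multiples of 32):
+ ```python
+ import numpy as np
+
+ def quantize_q8_0(weights: np.ndarray):
+     """Q8_0: 32-weight blocks, one fp16 scale each, int8 quants."""
+     blocks = weights.reshape(-1, 32).astype(np.float32)  # assumes size % 32 == 0
+     d = np.abs(blocks).max(axis=1) / 127.0               # per-block scale
+     d = np.where(d == 0.0, 1.0, d)                       # guard all-zero blocks
+     qs = np.round(blocks / d[:, None]).astype(np.int8)
+     return d.astype(np.float16), qs
+
+ def dequantize_q8_0(d: np.ndarray, qs: np.ndarray) -> np.ndarray:
+     return (d.astype(np.float32)[:, None] * qs).reshape(-1)
+ ```
+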
+ ### Motivation
+ Flux's weights were published in bf16.
+ Conversion to fp16 is slightly lossy, because fp16 has a much narrower exponent range than bf16, while conversion to fp32 is lossless: every bf16 value is exactly representable in fp32.
+ I experimented with mixed tensor formats to see whether they would improve quality.
+
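+ The fp16 loss is a range effect, not a precision one, which is easy to check in PyTorch (a tiny illustrative example; the two values sit inside bf16's range but outside fp16's):
+ ```python
+ import torch
+
+ x = torch.tensor([3.0e38, 1.0e-40], dtype=torch.bfloat16)
+ print(x.to(torch.float16))  # tensor([inf, 0.]) – overflow and underflow
+ print(x.to(torch.float32))  # exact – fp32 represents every bf16 value
+ ```
+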
+ ### Evaluation
+ I tried comparing the outputs, but I can't say with any certainty whether these models are significantly better than pure Q8_0.
+ You're probably better off using Q8_0, but I thought I'd share these – maybe someone will find them useful.
+
+ Higher bits per weight (bpw) numbers result in slower computation:
+ ```
+  20 s  Q8_0
+  23 s  11.0bpw-txt16
+  30 s  fp16
+  37 s  16.4bpw-txt32
+ 310 s  fp32
+ ```
+
+ In the txt16/txt32 files, I quantized only these layers to Q8_0, unless they were one-dimensional (see the sketch after these lists):
+ ```
+ img_mlp.0
+ img_mlp.2
+ img_mod.lin
+ linear1
+ linear2
+ modulation.lin
+ ```
+ These layers were left at fp16 or fp32, respectively:
+ ```
+ txt_mlp.0
+ txt_mlp.2
+ txt_mod.lin
+ ```
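+ A sketch of the selection rule just described (the function and its `keep_dtype` parameter are hypothetical, not taken from any converter):
+ ```python
+ # keep_dtype is "F16" for the txt16 file and "F32" for txt32.
+ Q8_LAYERS = ("img_mlp.0", "img_mlp.2", "img_mod.lin",
+              "linear1", "linear2", "modulation.lin")
+
+ def target_format(name: str, shape: tuple, keep_dtype: str) -> str:
+     if len(shape) == 1:                            # 1-D tensors stay at high precision
+         return keep_dtype
+     if any(layer in name for layer in Q8_LAYERS):  # the list above -> Q8_0
+         return "Q8_0"
+     return keep_dtype                              # txt_mlp.*, txt_mod.lin, the rest
+ ```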
+ The bpw figures in the file names are just approximations derived from the file sizes.
+
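+ Concretely, assuming FLUX.1-dev's roughly 12 billion parameters (the file size below is only illustrative):
+ ```python
+ def approx_bpw(file_size_bytes: float, n_params: float = 12e9) -> float:
+     # n_params ~ FLUX.1-dev's parameter count (an assumption, not measured)
+     return file_size_bytes * 8 / n_params
+
+ print(f"{approx_bpw(16.5e9):.1f} bpw")  # a hypothetical 16.5 GB file -> 11.0 bpw
+ ```
+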
+ ---
+
+ This is a direct GGUF conversion of [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev/tree/main).
+
+ As this is a quantized model, not a finetune, all the original license terms and restrictions still apply.
+
+ The model files can be used with the [ComfyUI-GGUF](https://github.com/city96/ComfyUI-GGUF) custom node.
+
+ Place the model files in `ComfyUI/models/unet`; see the GitHub README for further installation instructions.
+
+ Please refer to [this chart](https://github.com/ggerganov/llama.cpp/blob/master/examples/perplexity/README.md#llama-3-8b-scoreboard) for a basic overview of quantization types.
+
+ (Model card mostly copied from [city96/FLUX.1-dev-gguf](https://huggingface.co/city96/FLUX.1-dev-gguf), which contains conventional and useful GGUF files.)