metadata
license: other
license_name: flux-1-dev-non-commercial-license
license_link: https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/LICENSE.md
base_model:
  - black-forest-labs/FLUX.1-dev
pipeline_tag: text-to-image
library_name: diffusers
tags:
  - flux
  - text-to-image

Flux.1 Lite


We are thrilled to announce the alpha release of Flux.1 Lite, an 8B parameter transformer model distilled from the FLUX.1-dev model.

Our goal? To distill FLUX.1-dev into a lighter model that runs smoothly on 24 GB consumer-grade GPU cards, making high-quality AI models accessible to everyone.
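As a rough sanity check on the memory budget, the size of the raw weights is simply the parameter count times the bytes per parameter. The sketch below assumes bfloat16 storage (2 bytes per weight) and counts only the transformer, not the text encoders, the VAE, or activations. The 12B figure for FLUX.1-dev is the publicly stated parameter count and does not come from this README.

```python
BYTES_PER_PARAM_BF16 = 2  # bfloat16 stores each weight in 16 bits


def weight_size_gb(n_params: float, bytes_per_param: int = BYTES_PER_PARAM_BF16) -> float:
    """Size of the raw weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


# Flux.1 Lite transformer: 8B parameters (from this README).
lite_gb = weight_size_gb(8e9)
# FLUX.1-dev transformer: 12B parameters (assumption: publicly stated figure).
dev_gb = weight_size_gb(12e9)

print(f"Flux.1 Lite transformer weights: ~{lite_gb:.0f} GB in bfloat16")
print(f"FLUX.1-dev transformer weights:  ~{dev_gb:.0f} GB in bfloat16")
print(f"Difference in weights alone:     ~{dev_gb - lite_gb:.0f} GB")
```

Actual peak memory during inference will be higher than these numbers, since the other pipeline components and intermediate activations also occupy GPU memory.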

Flux.1 Lite vs FLUX.1-dev

Motivation

As other members of the community, such as Ostris, have noted, the blocks of the FLUX.1-dev transformer do not contribute equally to the final image. To explore this, we computed the Mean Squared Error (MSE) between the input and output of each block, which revealed significant variability across blocks.
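The per-block measurement described above can be sketched with PyTorch forward hooks. This is a minimal illustration and not the analysis code used for Flux.1 Lite: `ToyResidualBlock`, `block_io_mse`, and the per-block scales are hypothetical stand-ins, and real MMDIT blocks carry separate image and text streams that a real hook would need to handle.

```python
import torch
import torch.nn as nn


class ToyResidualBlock(nn.Module):
    """Hypothetical stand-in for a transformer block: x + scale * f(x)."""

    def __init__(self, dim: int, scale: float):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.scale = scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.scale * torch.tanh(self.proj(x))


def block_io_mse(blocks: nn.ModuleList, x: torch.Tensor) -> list:
    """Run x through the blocks in order and record MSE(input, output) per block."""
    scores = []

    def hook(module, inputs, output):
        # `inputs` is the tuple of positional arguments; the hidden state is the first.
        scores.append(torch.mean((output - inputs[0]) ** 2).item())

    handles = [block.register_forward_hook(hook) for block in blocks]
    try:
        with torch.inference_mode():
            for block in blocks:
                x = block(x)
    finally:
        # Always remove the hooks so the blocks are left unmodified.
        for handle in handles:
            handle.remove()
    return scores


torch.manual_seed(0)
# Assumed scales, chosen only so that the toy blocks differ in how much they change x.
scales = [1.0, 0.5, 0.05, 0.01]
blocks = nn.ModuleList([ToyResidualBlock(16, s) for s in scales])
mse_per_block = block_io_mse(blocks, torch.randn(2, 8, 16))

for i, score in enumerate(mse_per_block):
    print(f"block {i}: MSE(input, output) = {score:.6f}")
```

A block whose output is nearly identical to its input (low MSE) changes the hidden state very little, which makes it a candidate for removal. This is only a proxy, so any candidate would still need to be checked against generated images.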

Our findings? Not all blocks are created equal. By strategically skipping the less impactful blocks, we achieve substantial efficiency gains without compromising on quality. The contrast is striking: skipping just one of the early MMDIT blocks can significantly degrade model performance, whereas skipping other blocks has a much smaller effect.

[Figures: Flux.1 Lite generated image; MSE MMDIT; MSE DIT]

Furthermore, as the following images show, model performance degrades severely only when one of the first MMDIT blocks is skipped.

[Figures: Skip one MMDIT block; Skip one DIT block]
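The skip-one-block experiment can be illustrated on a toy stack of residual blocks: run the full stack once as a reference, then rerun it with each block replaced by the identity and compare the final outputs. This is a sketch under stated assumptions, not the evaluation used for Flux.1 Lite. `ToyResidualBlock`, `run_blocks`, `skip_one_block_mse`, and the scales are hypothetical, and output MSE stands in for the visual comparison shown in the images.

```python
import torch
import torch.nn as nn


class ToyResidualBlock(nn.Module):
    """Hypothetical stand-in for a transformer block: x + scale * f(x)."""

    def __init__(self, dim: int, scale: float):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.scale = scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.scale * torch.tanh(self.proj(x))


def run_blocks(blocks, x, skip=None):
    """Apply the blocks in order, treating the block at index `skip` as identity."""
    for i, block in enumerate(blocks):
        if i != skip:
            x = block(x)
    return x


def skip_one_block_mse(blocks, x):
    """MSE between the full model's output and the output with block i skipped."""
    with torch.inference_mode():
        reference = run_blocks(blocks, x)
        return [
            torch.mean((run_blocks(blocks, x, skip=i) - reference) ** 2).item()
            for i in range(len(blocks))
        ]


torch.manual_seed(0)
# Assumed scales: early blocks change the hidden state a lot, later ones barely.
scales = [1.0, 0.5, 0.05, 0.01]
blocks = nn.ModuleList([ToyResidualBlock(16, s) for s in scales])
skip_mse = skip_one_block_mse(blocks, torch.randn(2, 8, 16))

for i, score in enumerate(skip_mse):
    print(f"skip block {i}: output MSE vs. full model = {score:.6f}")
```

In this toy setup the ordering is built in by the chosen scales. In a real model the per-block impact has to be measured, which is what the figures above report.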

Text-to-Image Usage

Flux.1 Lite is ready to unleash your creativity! For the best results, we recommend using a guidance_scale of 3.5 and setting n_steps between 22 and 30.

import torch
from diffusers import FluxPipeline

model_id = "Freepik/flux.1-lite-8B-alpha"
torch_dtype = torch.bfloat16
device = "cuda"

# Load the pipeline in bfloat16 and move it to the GPU
pipe = FluxPipeline.from_pretrained(
    model_id, torch_dtype=torch_dtype
).to(device)

# Inference
prompt = "A close-up image of a green alien with fluorescent skin in the middle of a dark purple forest"

guidance_scale = 3.5  # Keep guidance_scale at 3.5
n_steps = 28
seed = 11

with torch.inference_mode():
    image = pipe(
        prompt=prompt,
        generator=torch.Generator(device="cpu").manual_seed(seed),
        num_inference_steps=n_steps,
        guidance_scale=guidance_scale,
        height=1024,
        width=1024,
    ).images[0]
image.save("output.png")

ComfyUI

We've also crafted a ComfyUI workflow to make using Flux.1 Lite even more seamless! Find it in comfy/flux.1-lite_workflow.json.

[Figure: ComfyUI workflow]

Checkpoints

  • flux.1-lite-8B-alpha.safetensors: Transformer checkpoint, in the original Flux format.
  • transformers/: Contains the distilled 8B transformer model, in diffusers format.

🤗 Hugging Face space:

Flux.1 Lite demo hosted on 🤗 flux.1-lite

🔥 News 🔥

Citation

If you find our work helpful, please cite it!

@article{flux1-lite,
  title={Flux.1 Lite: Distilling Flux1.dev for Efficient Text-to-Image Generation},
  author={Daniel Verdú and Javier Martín},
  email={dverdu@freepik.com, javier.martin@freepik.com},
  year={2024},
}