Kijai/flux-fp8 · Loading flux-fp8 with diffusers

Hello, I'm currently trying to load this model using diffusers. I have converted the original fp16 model to fp8 using the script below:

from safetensors.torch import load_file, save_file
import torch
import json

path = "flux1-dev.sft" # input file

# read the __metadata__ entry from the safetensors header
def read_safetensors_metadata(path):
    with open(path, 'rb') as f:
        header_size = int.from_bytes(f.read(8), 'little')  # first 8 bytes: header length
        header_json = f.read(header_size).decode('utf-8')
        header = json.loads(header_json)
        metadata = header.get('__metadata__', {})
        return metadata

metadata = read_safetensors_metadata(path)
print(json.dumps(metadata, indent=4))  # show metadata

sd_fp8 = {}  # converted tensors, keyed by parameter name

state_dict = load_file(path)  # load every tensor from the safetensors file
for key, tensor in state_dict.items():
    sd_fp8[key] = tensor.to(torch.float8_e4m3fn)  # cast each tensor to fp8 (e4m3)

# save the converted tensors, carrying over the original metadata
save_file(sd_fp8, "flux1-dev-fp8.safetensors", metadata={"format": "pt", **metadata})
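One way to check that the conversion did what was intended, without loading the full checkpoint, is to inspect the safetensors header alone. The sketch below uses only the standard library and assumes the usual safetensors layout (an 8-byte little-endian header length, then a JSON header mapping tensor names to dtype, shape, and data offsets). The helper names and the tiny test file are hypothetical and only there for illustration.

```python
import json
import struct
from collections import Counter


def read_safetensors_header(path):
    """Return the parsed JSON header of a safetensors file."""
    with open(path, "rb") as f:
        (header_size,) = struct.unpack("<Q", f.read(8))  # little-endian u64
        return json.loads(f.read(header_size).decode("utf-8"))


def count_dtypes(header):
    """Count tensors per dtype string, skipping the __metadata__ entry."""
    return Counter(
        entry["dtype"] for name, entry in header.items() if name != "__metadata__"
    )


def write_tiny_safetensors(path):
    """Write a hypothetical two-element fp8 tensor so the reader can be tried out."""
    header = {
        "__metadata__": {"format": "pt"},
        "w": {"dtype": "F8_E4M3", "shape": [2], "data_offsets": [0, 2]},
    }
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))
        f.write(header_bytes)
        f.write(b"\x00\x00")  # two bytes of tensor data (two fp8 zeros)
```

Running count_dtypes(read_safetensors_header("flux1-dev-fp8.safetensors")) on the converted file should then report a single dtype if every tensor was cast to fp8.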

However, I am not sure how to use the newly saved file in diffusers. This is what I tried:

import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("flux1-dev-fp8.safetensors", torch_dtype=torch.float8_e4m3fn, local_files_only=True)
pipe.enable_model_cpu_offload()

prompt = "A cat holding a sign that says hello world"
image = pipe(
    prompt,
    height=1024,
    width=1024,
    guidance_scale=3.5,
    output_type="pil",
    num_inference_steps=50,
    max_sequence_length=512,
    generator=torch.Generator("cpu").manual_seed(0)
).images[0]
image.save("flux-dev.png")

The code above fails with the following error:

huggingface_hub.utils._errors.LocalEntryNotFoundError: Cannot find an appropriate cached snapshot folder for the specified revision on the local disk and outgoing traffic has been disabled. To enable repo look-ups and downloads online, pass 'local_files_only=False' as input.

Without the local_files_only kwarg, it instead tries to download the model from the Hugging Face Hub.
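A likely explanation is that FluxPipeline.from_pretrained expects either a Hub repo id or a local directory containing model_index.json, so a single .safetensors filename gets treated as a repo id. The sketch below is one possible workaround, not a confirmed fix: it assumes a recent diffusers release in which FluxTransformer2DModel.from_single_file is available, and it assumes the remaining components (text encoders, VAE, scheduler) can be fetched from black-forest-labs/FLUX.1-dev or a local copy of it. The helper names are hypothetical. Note that loading with torch_dtype=torch.bfloat16 upcasts the fp8 weights, so the on-disk size savings do not carry over to memory.

```python
import os


def looks_like_pipeline_dir(path):
    """True if path is a directory holding a diffusers model_index.json."""
    return os.path.isdir(path) and os.path.isfile(
        os.path.join(path, "model_index.json")
    )


def load_flux_with_fp8_transformer(
    checkpoint_path, base_repo="black-forest-labs/FLUX.1-dev"
):
    """Load the single-file transformer, then build the rest of the pipeline.

    Assumption: base_repo (or a local directory of the same layout) supplies
    every component other than the transformer.
    """
    # Heavy imports are kept local so the check above works without them.
    import torch
    from diffusers import FluxPipeline, FluxTransformer2DModel

    # Load only the transformer weights from the single .safetensors file.
    transformer = FluxTransformer2DModel.from_single_file(
        checkpoint_path, torch_dtype=torch.bfloat16
    )
    # Build the full pipeline around the already-loaded transformer.
    pipe = FluxPipeline.from_pretrained(
        base_repo, transformer=transformer, torch_dtype=torch.bfloat16
    )
    return pipe
```

Here looks_like_pipeline_dir("flux1-dev-fp8.safetensors") returns False, which is the case where the single-file loader is needed rather than from_pretrained on the file path.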