
Convert color images to grayscale

See the corresponding discussion at https://github.com/lllyasviel/ControlNet/discussions/561 !

I have trained a ControlNet (214244a32 drop=0.5 mp=fp16 lr=1e-5) for 1.25 epochs by using a pointwise function to convert RGB to grayscale... which effectively makes it a pointless ControlNet 🤣

I wanted to see how fast it converges on a simple linear transformation. To emphasize again: it doesn't colorize grayscale images, it desaturates color images... which you might as well do in an image editor. It's the most ineffective way to make grayscale images. But it lets us evaluate the model very easily, and we can peer into the inner workings of ControlNet a bit. It's also a good baseline for inpainting (assuming 0% masking) and tells us which artefacts to expect in the unmasked area. I chose drop=0.5 because I assumed the ControlNet would pick up the "ignore the prompt" task very quickly, just like the desaturation task; it also lets us compare the influence of prompts and keeps the setup comparable with inpainting. I don't think it would have converged faster without any prompts.

Training

accelerate launch train_controlnet.py \
  --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
  --train_batch_size=4 \
  --gradient_accumulation_steps=8 \
  --proportion_empty_prompts=0.5 \
  --mixed_precision="fp16" \
  --learning_rate=1e-5 \
  --enable_xformers_memory_efficient_attention \
  --use_8bit_adam \
  --set_grads_to_none \
  --seed=0
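
The checkpoint can be loaded like any other ControlNet with diffusers. Below is a minimal inference sketch, assuming the weights are hosted under a placeholder repo id (`your-account/controlnet-grayscale` is not the real name); any color image is passed directly as the conditioning image, and the output should come back desaturated:

```python
# Minimal inference sketch (assumes a diffusers version with ControlNet support).
# The ControlNet repo id below is a placeholder, not the actual published checkpoint.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline, UniPCMultistepScheduler
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "your-account/controlnet-grayscale",  # placeholder repo id
    torch_dtype=torch.float16,
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)

# The conditioning image is just an ordinary color image; the ControlNet was
# trained so that the output is (roughly) its grayscale version.
condition = load_image("input.png").resize((512, 512))

image = pipe(
    prompt="",  # prompts were dropped 50% of the time during training, so an empty prompt is fine
    image=condition,
    num_inference_steps=20,
    generator=torch.Generator("cuda").manual_seed(0),
).images[0]
image.save("output_grayscale.png")
```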

Image dataset

  • laion2B-en aesthetics>=6.5 dataset
  • --min_image_size 512 --max_aspect_ratio 2 --resize_mode="center_crop" --image_size 512
  • Cleaned with fastdup default settings
  • Data augmented with right-left flipped images
  • Resulting in 214244 images
  • Converted to grayscale with cv2 (see the sketch below)
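
The cv2 step is nothing more than a standard luminance conversion. Here is a sketch of how the grayscale targets could have been produced (directory names and file layout are assumptions, not taken from the card); the original color image stays as the conditioning image, and its desaturated copy, replicated back to three channels, becomes the training target:

```python
# Sketch of the grayscale conversion step (paths and layout are assumptions).
import cv2
from pathlib import Path

src_dir = Path("images")            # color originals (conditioning images)
dst_dir = Path("images_grayscale")  # grayscale training targets
dst_dir.mkdir(exist_ok=True)

for path in src_dir.glob("*.jpg"):
    bgr = cv2.imread(str(path))                        # cv2 loads images as BGR
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)       # pointwise luminance conversion
    gray_3ch = cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR)  # back to 3 channels for Stable Diffusion
    cv2.imwrite(str(dst_dir / path.name), gray_3ch)
```

cv2.cvtColor uses the usual BT.601 weights (0.299 R + 0.587 G + 0.114 B), i.e. exactly the kind of pointwise linear transformation this experiment is about.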