---
license: cc
library_name: diffusers
tags:
  - art
model-index:
  - name: 16ch-VAE
    results:
      - task:
          type: encoder-loss
        dataset:
          name: yerevann/coco-karpathy
          type: image
        metrics:
          - name: PSNR
            type: PSNR
            value: 31.5151
---

# 16ch-VAE

Disclaimer: this VAE is not intended to be a replacement for SD3's VAE since the latent spaces are entirely different.

A fully open-source reproduction of SD3's 16-channel (16ch) VAE. Useful for people who are building their own image generation models and need an off-the-shelf VAE. Natively trained in fp16.

| VAE | rFID | PSNR | LPIPS |
| --- | --- | --- | --- |
| SD1.5 VAE | 0.3131 | 26.4332 | 0.0328 |
| SDXL VAE | 0.3511 | 26.7577 | 0.032 |
| SD3 VAE | 0.0257 | 30.3231 | 0.0132 |
| 16ch-VAE | 0.0667 | 31.5151 | 0.0136 |
| 16ch-VAE with FFT* | 0.1584 | 31.0542 | 0.0281 |
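
For readers unfamiliar with the PSNR column, here is a minimal sketch of how peak signal-to-noise ratio between an input image and its VAE reconstruction is commonly computed. The evaluation protocol behind the table (resolution, value range, averaging over the dataset) is not stated in this README, so the `psnr` helper and the `[0, 1]` pixel range below are assumptions for illustration, not the script used to produce these numbers.

```python
import numpy as np


def psnr(reference: np.ndarray, reconstruction: np.ndarray, max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two images of the same shape.

    Assumes pixel values lie in [0, max_value]; this range is an assumption,
    not something the model card specifies.
    """
    reference = reference.astype(np.float64)
    reconstruction = reconstruction.astype(np.float64)
    # Mean squared error over all pixels and channels.
    mse = np.mean((reference - reconstruction) ** 2)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))


if __name__ == "__main__":
    # Toy stand-in for an (image, reconstruction) pair: a random image plus mild noise.
    rng = np.random.default_rng(0)
    image = rng.random((64, 64, 3))
    noisy = np.clip(image + rng.normal(0.0, 0.03, image.shape), 0.0, 1.0)
    print(f"PSNR: {psnr(image, noisy):.2f} dB")
```

Higher PSNR means the reconstruction is closer to the input pixel for pixel. rFID and LPIPS, where lower is better, instead capture distributional and perceptual differences.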

## Usage

Awaiting the merge of https://github.com/huggingface/diffusers/pull/8769 into diffusers!