@multimodalart on Hugging Face: "The first open Stable Diffusion 3-like architecture model is JUST out 💣


multimodalart

posted an update May 14

The first open Stable Diffusion 3-like architecture model is JUST out 💣 - but it is not SD3! 🤔

It is Tencent-Hunyuan/HunyuanDiT by Tencent, a 1.5B-parameter DiT (diffusion transformer) text-to-image model 🖼️✨, trained with multilingual CLIP + multilingual T5 text encoders for English 🤝 Chinese understanding

Try it out yourself here ▶️ https://huggingface.co/spaces/multimodalart/HunyuanDiT
(it's a bit slow, as the model is chunky and the research code isn't super optimized for inference speed yet)

In the paper, they claim it is SOTA among open-source models based on human preference evaluation!

deleted

May 14

Yeah, it's just too slow 👍🏻💀

multimodalart Apolinário from multimodal AI art