Submitted by akhaliq · Rethinking FID: Towards a Better Evaluation Metric for Image Generation · 6 authors
Submitted by akhaliq · Improving fine-grained understanding in image-text pre-training · 11 authors
Submitted by akhaliq · WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens · 6 authors
Submitted by akhaliq · SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild · 11 authors
Submitted by akhaliq · FreGrad: Lightweight and Fast Frequency-aware Diffusion Vocoder · 5 authors
Submitted by akhaliq · CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects · 7 authors