Aligned Novel View Image and Geometry Synthesis via Cross-modal Attention Instillation • Paper • 2506.11924 • Published 12 days ago • 32
Fine-Grained Perturbation Guidance via Attention Head Selection • Paper • 2506.10978 • Published 12 days ago • 26
Describe Anything • Collection • Multimodal Large Language Models for Detailed Localized Image and Video Captioning • 7 items • Updated 7 days ago • 52
NormalCrafter: Learning Temporally Consistent Normals from Video Diffusion Priors • Paper • 2504.11427 • Published Apr 15 • 19
UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics • Paper • 2412.07774 • Published Dec 10, 2024 • 31