Training-Free Text-Guided Color Editing with Multi-Modal Diffusion Transformer • Paper • 2508.09131 • Published 11 days ago • 13
SpeakerVid-5M: A Large-Scale High-Quality Dataset for Audio-Visual Dyadic Interactive Human Generation • Paper • 2507.09862 • Published Jul 14 • 49
AgentAvatar: Disentangling Planning, Driving and Rendering for Photorealistic Avatar Agents • Paper • 2311.17465 • Published Nov 29, 2023
Learning One-Shot 4D Head Avatar Synthesis using Synthetic Data • Paper • 2311.18729 • Published Nov 30, 2023
PICTURE: PhotorealistIC virtual Try-on from UnconstRained dEsigns • Paper • 2312.04534 • Published Dec 7, 2023 • 6
Portrait4D-v2: Pseudo Multi-View Data Creates Better 4D Head Synthesizer • Paper • 2403.13570 • Published Mar 20, 2024 • 3
Talking Head Generation with Probabilistic Audio-to-Visual Diffusion Priors • Paper • 2212.04248 • Published Dec 7, 2022
Taming Teacher Forcing for Masked Autoregressive Video Generation • Paper • 2501.12389 • Published Jan 21 • 10