Training-Free Efficient Video Generation via Dynamic Token Carving Paper • 2505.16864 • Published 21 days ago • 21
Improved Diffusion-based Image Colorization via Piggybacked Models Paper • 2304.11105 • Published Apr 21, 2023
Video Colorization with Pre-trained Text-to-Image Diffusion Models Paper • 2306.01732 • Published Jun 2, 2023
Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation Paper • 2307.06940 • Published Jul 13, 2023 • 10
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT Paper • 2505.00703 • Published May 1, 2025 • 43
Lumina-Image 2.0: A Unified and Efficient Image Generative Framework Paper • 2503.21758 • Published Mar 27, 2025 • 22
TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models Paper • 2503.05638 • Published Mar 7, 2025 • 19
MotionCanvas: Cinematic Shot Design with Controllable Image-to-Video Generation Paper • 2502.04299 • Published Feb 6, 2025 • 18
IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models Paper • 2501.13920 • Published Jan 23, 2025 • 17
Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation Paper • 2412.09428 • Published Dec 12, 2024 • 7
VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection Paper • 2411.14794 • Published Nov 22, 2024 • 13
ViewCrafter: Taming Video Diffusion Models for High-fidelity Novel View Synthesis Paper • 2409.02048 • Published Sep 3, 2024 • 3
Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT Paper • 2406.18583 • Published Jun 5, 2024
Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers Paper • 2405.05945 • Published May 9, 2024 • 3
Video Background Music Generation: Dataset, Method and Evaluation Paper • 2211.11248 • Published Nov 21, 2022
LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT Paper • 2306.17103 • Published Jun 29, 2023 • 1