UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation Paper • 2506.03147 • Published 3 days ago • 55 • 2
MAGREF: Masked Guidance for Any-Reference Video Generation Paper • 2505.23742 • Published 8 days ago • 9 • 2
OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation Paper • 2505.20292 • Published 11 days ago • 52 • 3
Sci-Fi: Symmetric Constraint for Frame Inbetweening Paper • 2505.21205 • Published 10 days ago • 5 • 2
ImgEdit: A Unified Image Editing Dataset and Benchmark Paper • 2505.20275 • Published 11 days ago • 17 • 3
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation Paper • 2505.04512 • Published about 1 month ago • 35 • 3
Packing Input Frame Context in Next-Frame Prediction Models for Video Generation Paper • 2504.12626 • Published Apr 17 • 50 • 3
MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft Paper • 2504.08388 • Published Apr 11 • 40 • 3