EPiC: Efficient Video Camera Control Learning with Precise Anchor-Video Guidance Paper • 2505.21876 • Published May 28, 2025
Video-Skill-CoT: Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning Paper • 2506.03525 • Published Jun 4, 2025
CLaMR: Contextualized Late-Interaction for Multimodal Content Retrieval Paper • 2506.06144 • Published Jun 6, 2025
A Survey on Long-Video Storytelling Generation: Architectures, Consistency, and Cinematic Quality Paper • 2507.07202 • Published 15 days ago
LST: Ladder Side-Tuning for Parameter and Memory Efficient Transfer Learning Paper • 2206.06522 • Published Jun 13, 2022
Training-free Guidance in Text-to-Video Generation via Multimodal Planning and Structured Noise Initialization Paper • 2504.08641 • Published Apr 11, 2025
CAPTURe: Evaluating Spatial Reasoning in Vision Language Models via Occluded Object Counting Paper • 2504.15485 • Published Apr 21, 2025
Executable Functional Abstractions: Inferring Generative Programs for Advanced Math Problems Paper • 2504.09763 • Published Apr 14, 2025
M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding Paper • 2411.04952 • Published Nov 7, 2024
VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement Paper • 2411.15115 • Published Nov 22, 2024
Diagnostic Benchmark and Iterative Inpainting for Layout-Guided Image Generation Paper • 2304.06671 • Published Apr 13, 2023
DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback Paper • 2410.06215 • Published Oct 8, 2024
Self-Chained Image-Language Model for Video Localization and Question Answering Paper • 2305.06988 • Published May 11, 2023
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Models Paper • 2202.04053 • Published Feb 8, 2022
Visual Programming for Text-to-Image Generation and Evaluation Paper • 2305.15328 • Published May 24, 2023
VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks Paper • 2112.06825 • Published Dec 13, 2021