Temporal Preference Optimization for Long-Form Video Understanding Paper • 2501.13919 • Published Jan 23 • 22 • 3
IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models Paper • 2501.13920 • Published Jan 23 • 15 • 2
OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking Paper • 2501.09751 • Published Jan 16 • 47 • 2
Multimodal LLMs Can Reason about Aesthetics in Zero-Shot Paper • 2501.09012 • Published Jan 15 • 10 • 2
MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents Paper • 2501.08828 • Published Jan 15 • 30 • 2
Ouroboros-Diffusion: Exploring Consistent Content Generation in Tuning-free Long Video Diffusion Paper • 2501.09019 • Published Jan 15 • 12 • 2
HALoGEN: Fantastic LLM Hallucinations and Where to Find Them Paper • 2501.08292 • Published Jan 14 • 17 • 2
VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control Paper • 2501.01427 • Published Jan 2 • 51 • 3