Benchmarking Multimodal Mathematical Reasoning with Explicit Visual Dependency Paper • 2504.18589 • Published Apr 24 • 13 • 3
What Is That Talk About? A Video-to-Text Summarization Dataset for Scientific Presentations Paper • 2502.08279 • Published Feb 12 • 1
MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly Paper • 2505.10610 • Published May 15 • 54
PosterSum: A Multimodal Benchmark for Scientific Poster Summarization Paper • 2502.17540 • Published Feb 24 • 3
Lost in Time: Clock and Calendar Understanding Challenges in Multimodal LLMs Paper • 2502.05092 • Published Feb 7 • 8