Evaluation is All You Need: Strategic Overclaiming of LLM Reasoning Capabilities Through Evaluation Design Paper • 2506.04734 • Published Jun 5 • 19
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset Paper • 2505.09568 • Published May 14 • 95
microsoft/table-transformer-structure-recognition Object Detection • 0.0B • Updated Sep 6, 2023 • 1.17M • 195
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? Paper • 2504.13837 • Published Apr 18 • 132
mistralai/Mistral-Small-3.1-24B-Instruct-2503 Image-Text-to-Text • 24B • Updated May 9 • 221k • • 1.29k
Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path? Paper • 2502.15657 • Published Feb 21 • 5
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling Paper • 2412.05271 • Published Dec 6, 2024 • 160