- MM-IQ: Benchmarking Human-Like Abstraction and Reasoning in Multimodal Models
  Paper • 2502.00698 • Published • 22
- DeepRAG: Thinking to Retrieval Step by Step for Large Language Models
  Paper • 2502.01142 • Published • 21
- ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning
  Paper • 2502.01100 • Published • 14
- The Jumping Reasoning Curve? Tracking the Evolution of Reasoning Performance in GPT-[n] and o-[n] Models on Multimodal Puzzles
  Paper • 2502.01081 • Published • 12
Collections including paper arxiv:2502.03860
- RL + Transformer = A General-Purpose Problem Solver
  Paper • 2501.14176 • Published • 22
- Towards General-Purpose Model-Free Reinforcement Learning
  Paper • 2501.16142 • Published • 24
- SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
  Paper • 2501.17161 • Published • 102
- MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
  Paper • 2412.12098 • Published • 4