ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning Paper • 2503.19470 • Published 4 days ago • 14
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild Paper • 2503.18892 • Published 5 days ago • 27
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published 11 days ago • 108
Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning Paper • 2503.15558 • Published 10 days ago • 42
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning Paper • 2503.09516 • Published 17 days ago • 27
Gemini Embedding: Generalizable Embeddings from Gemini Paper • 2503.07891 • Published 18 days ago • 34
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL Paper • 2503.07536 • Published 19 days ago • 80
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia Paper • 2503.07920 • Published 18 days ago • 95
Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published 23 days ago • 86
Self-rewarding correction for mathematical reasoning Paper • 2502.19613 • Published about 1 month ago • 82
Kanana: Compute-efficient Bilingual Language Models Paper • 2502.18934 • Published about 1 month ago • 64
Think Inside the JSON: Reinforcement Strategy for Strict LLM Schema Adherence Paper • 2502.14905 • Published Feb 18 • 9
VLM^2-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues Paper • 2502.12084 • Published Feb 17 • 29
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published Feb 20 • 138
MLGym: A New Framework and Benchmark for Advancing AI Research Agents Paper • 2502.14499 • Published Feb 20 • 188
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines Paper • 2502.14739 • Published Feb 20 • 99