S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning Paper • 2502.12853 • Published 4 days ago • 15
LLM-based User Profile Management for Recommender System Paper • 2502.14541 • Published 1 day ago • 4
Multimodal RewardBench: Holistic Evaluation of Reward Models for Vision Language Models Paper • 2502.14191 • Published 2 days ago • 1
CLIPPER: Compression enables long-context synthetic data generation Paper • 2502.14854 • Published 1 day ago • 3
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning Paper • 2502.14768 • Published 1 day ago • 24
LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models Paper • 2502.14834 • Published 1 day ago • 19
PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC Paper • 2502.14282 • Published 2 days ago • 12
MLGym: A New Framework and Benchmark for Advancing AI Research Agents Paper • 2502.14499 • Published 1 day ago • 124
Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation Paper • 2502.14846 • Published 1 day ago • 9
AlphaMaze: Enhancing Large Language Models' Spatial Intelligence via GRPO Paper • 2502.14669 • Published 1 day ago • 5
RelaCtrl: Relevance-Guided Efficient Control for Diffusion Transformers Paper • 2502.14377 • Published 1 day ago • 4
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines Paper • 2502.14739 • Published 1 day ago • 81
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published 1 day ago • 82
From Tools to Teammates: Evaluating LLMs in Multi-Session Coding Interactions Paper • 2502.13791 • Published 3 days ago • 4
Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models Paper • 2502.13533 • Published 3 days ago • 5