ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning Paper • 2503.19470 • Published 3 days ago • 12
Open Deep Search: Democratizing Search with Open-source Reasoning Agents Paper • 2503.20201 • Published 2 days ago • 22
Think Twice: Enhancing LLM Reasoning by Scaling Multi-round Test-time Thinking Paper • 2503.19855 • Published 2 days ago • 21
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't Paper • 2503.16219 • Published 7 days ago • 43
view article Article NVIDIA's GTC 2025 Announcement for Physical AI Developers: New Open Models and Datasets 10 days ago • 29
Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond Paper • 2503.10460 • Published 14 days ago • 27
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models Paper • 2403.13372 • Published Mar 20, 2024 • 76
SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models Paper • 2502.09604 • Published Feb 13 • 34
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 270
Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions Paper • 2411.14405 • Published Nov 21, 2024 • 61
steiner-preview Collection Reasoning models trained on synthetic data using reinforcement learning. • 3 items • Updated Oct 20, 2024 • 32