Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models Paper • 2506.01413 • Published 9 days ago • 15
MetaFaith: Faithful Natural Language Uncertainty Expression in LLMs Paper • 2505.24858 • Published 11 days ago • 17
A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models Paper • 2505.07591 • Published 30 days ago • 10
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6 • 170
Cost-of-Pass: An Economic Framework for Evaluating Language Models Paper • 2504.13359 • Published Apr 17 • 5
IFIR: A Comprehensive Benchmark for Evaluating Instruction-Following in Expert-Domain Information Retrieval Paper • 2503.04644 • Published Mar 6 • 21
Language Models' Factuality Depends on the Language of Inquiry Paper • 2502.17955 • Published Feb 25 • 34
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery Paper • 2408.06292 • Published Aug 12, 2024 • 126
SurveyX: Academic Survey Automation via Large Language Models Paper • 2502.14776 • Published Feb 20 • 100
Linear Correlation in LM's Compositional Generalization and Hallucination Paper • 2502.04520 • Published Feb 6 • 11
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models Paper • 2502.01061 • Published Feb 3 • 218
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28 • 122
Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate Paper • 2501.17703 • Published Jan 29 • 59