- REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
  Paper • 2501.03262 • Published • 100
- MiniMax-01: Scaling Foundation Models with Lightning Attention
  Paper • 2501.08313 • Published • 293
- Towards Best Practices for Open Datasets for LLM Training
  Paper • 2501.08365 • Published • 64
- Qwen2.5-1M Technical Report
  Paper • 2501.15383 • Published • 71
Collections including paper arxiv:2505.17667
- URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics
  Paper • 2501.04686 • Published • 53
- Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models
  Paper • 2501.09686 • Published • 41
- LLaVA-o1: Let Vision Language Models Reason Step-by-Step
  Paper • 2411.10440 • Published • 125
- TheoremExplainAgent: Towards Multimodal Explanations for LLM Theorem Understanding
  Paper • 2502.19400 • Published • 49

- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 150
- Orion-14B: Open-source Multilingual Large Language Models
  Paper • 2401.12246 • Published • 14
- MambaByte: Token-free Selective State Space Model
  Paper • 2401.13660 • Published • 59
- MM-LLMs: Recent Advances in MultiModal Large Language Models
  Paper • 2401.13601 • Published • 49