CRaSh: Clustering, Removing, and Sharing Enhance Fine-tuning without Full Large Language Model • Paper • 2310.15477 • Published Oct 24, 2023
Critical Data Size of Language Models from a Grokking Perspective • Paper • 2401.10463 • Published Jan 19, 2024 • 1
UltraMedical: Building Specialized Generalists in Biomedicine • Paper • 2406.03949 • Published Jun 6, 2024
MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding • Paper • 2501.18362 • Published Jan 30 • 23
DriveMoE: Mixture-of-Experts for Vision-Language-Action Model in End-to-End Autonomous Driving • Paper • 2505.16278 • Published May 22 • 5
Towards a Unified View of Large Language Model Post-Training • Paper • 2509.04419 • Published Sep 4 • 73
A Survey of Reinforcement Learning for Large Reasoning Models • Paper • 2509.08827 • Published Sep 10 • 183
SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning • Paper • 2509.09674 • Published Sep 11 • 77
FlowRL: Matching Reward Distributions for LLM Reasoning • Paper • 2509.15207 • Published about 1 month ago • 108
Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space • Paper • 2505.13308 • Published May 19 • 27
Technologies on Effectiveness and Efficiency: A Survey of State Spaces Models • Paper • 2503.11224 • Published Mar 14 • 28
PaD: Program-aided Distillation Specializes Large Models in Reasoning • Paper • 2305.13888 • Published May 23, 2023