100 Days After DeepSeek-R1: A Survey on Replication Studies and More Directions for Reasoning Language Models Paper • 2505.00551 • Published May 1 • 37
MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly Paper • 2505.10610 • Published 26 days ago • 53
Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models Paper • 2505.14810 • Published 21 days ago • 61
RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale Paper • 2505.03005 • Published May 5 • 32