MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning Paper • 2601.21468 • Published 5 days ago • 10
THINKSAFE: Self-Generated Safety Alignment for Reasoning Models Paper • 2601.23143 • Published 4 days ago • 38
PaperBanana: Automating Academic Illustration for AI Scientists Paper • 2601.23265 • Published 4 days ago • 54
DenseGRPO: From Sparse to Dense Reward for Flow Matching Model Alignment Paper • 2601.20218 • Published 7 days ago • 14
ASTRA: Automated Synthesis of agentic Trajectories and Reinforcement Arenas Paper • 2601.21558 • Published 5 days ago • 53
TTCS: Test-Time Curriculum Synthesis for Self-Evolving Paper • 2601.22628 • Published 4 days ago • 29
Self-Improving Pretraining: using post-trained models to pretrain better models Paper • 2601.21343 • Published 5 days ago • 14
Linear representations in language models can change dramatically over a conversation Paper • 2601.20834 • Published 6 days ago • 21
CooperBench: Why Coding Agents Cannot be Your Teammates Yet Paper • 2601.13295 • Published 15 days ago • 3
Towards Pixel-Level VLM Perception via Simple Points Prediction Paper • 2601.19228 • Published 7 days ago • 16
AdaReasoner: Dynamic Tool Orchestration for Iterative Visual Reasoning Paper • 2601.18631 • Published 8 days ago • 47
Harder Is Better: Boosting Mathematical Reasoning via Difficulty-Aware GRPO and Multi-Aspect Question Reformulation Paper • 2601.20614 • Published 6 days ago • 115
FP8-RL: A Practical and Stable Low-Precision Stack for LLM Reinforcement Learning Paper • 2601.18150 • Published 8 days ago • 6
Spark: Strategic Policy-Aware Exploration via Dynamic Branching for Long-Horizon Agentic Learning Paper • 2601.20209 • Published 7 days ago • 22
Innovator-VL: A Multimodal Large Language Model for Scientific Discovery Paper • 2601.19325 • Published 7 days ago • 76