Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o Under Data Scarsity Paper • 2502.11901 • Published 4 days ago • 6
Dyve: Thinking Fast and Slow for Dynamic Process Verification Paper • 2502.11157 • Published 5 days ago • 6
How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training Paper • 2502.11196 • Published 5 days ago • 20
SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering? Paper • 2502.12115 • Published 4 days ago • 41
SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models Paper • 2502.09390 • Published 9 days ago • 16
Tools for learning AI Collection This is a collection of tools on the hub that teachers and students can use to learn AI! • 9 items • Updated 5 days ago • 61
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published 14 days ago • 113
Great Models Think Alike and this Undermines AI Oversight Paper • 2502.04313 • Published 15 days ago • 29
Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2 Paper • 2502.03544 • Published 16 days ago • 42
Analyze Feature Flow to Enhance Interpretation and Steering in Language Models Paper • 2502.03032 • Published 17 days ago • 55
Large Language Model Guided Self-Debugging Code Generation Paper • 2502.02928 • Published 17 days ago • 11
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 17 days ago • 187
Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate Paper • 2501.17703 • Published 23 days ago • 55
Towards General-Purpose Model-Free Reinforcement Learning Paper • 2501.16142 • Published 25 days ago • 26