TransMamba: Flexibly Switching between Transformer and Mamba Paper • 2503.24067 • Published 19 days ago • 17
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes Paper • 2503.23461 • Published 20 days ago • 93
Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners Paper • 2502.20339 • Published Feb 27 • 2
Hogwild! Inference: Parallel LLM Generation via Concurrent Attention Paper • 2504.06261 • Published 11 days ago • 101
Kolmogorov-Arnold Attention: Is Learnable Attention Better For Vision Transformers? Paper • 2503.10632 • Published Mar 13 • 14
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 276
CodeAgent: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges Paper • 2401.07339 • Published Jan 14, 2024 • 1
MathFusion: Enhancing Mathematic Problem-solving of LLM through Instruction Fusion Paper • 2503.16212 • Published 30 days ago • 22
TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts Paper • 2407.03203 • Published Jul 3, 2024 • 12