Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • Jul 8 • 646
Article Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders By thomwolf and 1 other • Jul 9 • 666
Article Announcing NeurIPS 2025 E2LM Competition: Early Training Evaluation of Language Models By tiiuae and 8 others • Jul 4 • 9
Article nanoVLM: The simplest repository to train your VLM in pure PyTorch By ariG23498 and 6 others • May 21 • 208
Article The Transformers Library: standardizing model definitions By lysandre and 3 others • May 15 • 117
Grokking in the Wild: Data Augmentation for Real-World Multi-Hop Reasoning with Transformers Paper • 2504.20752 • Published Apr 29 • 93
Article Train your first Decision Transformer By edbeeching and 1 other • Sep 8, 2022 • 14
Article DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge By NormalUhr • Feb 7 • 211
Article What is test-time compute and how to scale it? By Kseniase and 1 other • Feb 6 • 103
Article Topic 33: Slim Attention, KArAt, XAttention and Multi-Token Attention Explained – What’s Really Changing in Transformers? By Kseniase and 1 other • Apr 4 • 14
Article Introducing the Synthetic Data Generator - Build Datasets with Natural Language By davidberenstein1957 and 5 others • Dec 16, 2024 • 137
Article Introducing RWKV — An RNN with the advantages of a transformer By BlinkDL and 3 others • May 15, 2023 • 23
FFN Fusion: Rethinking Sequential Computation in Large Language Models Paper • 2503.18908 • Published Mar 24 • 20
Article Open-Source Handwritten Signature Detection Model By samuellimabraz • Mar 14 • 118