Reinforcement Learning Finetunes Small Subnetworks in Large Language Models Paper • 2505.11711 • Published 24 days ago • 10
view article Article nanoVLM: The simplest repository to train your VLM in pure PyTorch By ariG23498 and 6 others • 20 days ago • 144
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? Paper • 2504.13837 • Published Apr 18 • 127
view article Article Gotchas in Tokenizer Behavior Every Developer Should Know By qgallouedec • Apr 18 • 37
Gemma 3 Collection All versions of Google's new multimodal models including QAT in 1B, 4B, 12B, and 27B sizes. In GGUF, dynamic 4-bit and 16-bit formats. • 50 items • Updated 11 days ago • 65
Running 2.67k 2.67k The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters
view article Article Open-R1: a fully open reproduction of DeepSeek-R1 By eliebak and 2 others • Jan 28 • 866