CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction Paper • 2502.07316 • Published 11 days ago • 42
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 17 days ago • 187
Article SmolVLM Grows Smaller – Introducing the 250M & 500M Models! about 1 month ago • 142
Article Fine-tune a SmolLM on domain-specific synthetic data from a LLM By davidberenstein1957 • Jan 3 • 34
SmolVLM Collection State-of-the-art compact VLMs for on-device applications: Base, Synthetic, and Instruct • 5 items • Updated 1 day ago • 34
💻 Local SmolLMs Collection SmolLM models in MLC, ONNX and GGUF format for local applications + in-browser demos • 14 items • Updated 1 day ago • 49
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25, 2024 • 92
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations Paper • 2405.18392 • Published May 28, 2024 • 12
Leaderboards and benchmarks ✨ Collection Cool leaderboard spaces collection for models across modalities! Text, vision, audio, ... • 90 items • Updated 17 days ago • 96
ZeroGPU Spaces Collection ZeroGPU Spaces made by the community • 17 items • Updated Jun 6, 2024 • 235
StarCoder 2 and The Stack v2: The Next Generation Paper • 2402.19173 • Published Feb 29, 2024 • 138
💫 StarCoder2 Collection StarCoder2 models and datasets! • 8 items • Updated Mar 1, 2024 • 83