view article Article The Transformers Library: standardizing model definitions By lysandre and 3 others • May 15 • 114
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The model preserves similar quality as half precision while using 3x less memory • 15 items • Updated 29 days ago • 201
view article Article From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub By jsulz and 3 others • Feb 12 • 66
llama.vim Collection Recommended models for the llama.vim and llama.vscode plugins • 9 items • Updated May 14 • 40
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 40 items • Updated 4 days ago • 118
Llama 3.2 3B & 1B GGUF Quants Collection Llama.cpp compatible quants for Llama 3.2 3B and 1B Instruct models. • 4 items • Updated Sep 26, 2024 • 46