Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular • 15 days ago • 90
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases Paper • 2402.14905 • Published Feb 22, 2024 • 134
Article KV Caching Explained: Optimizing Transformer Inference Efficiency • Jan 30, 2025 • 208
TorchAO: PyTorch-Native Training-to-Serving Model Optimization Paper • 2507.16099 • Published Jul 21, 2025 • 7
Article AI Policy @🤗: Response to the 2025 National AI R&D Strategic Plan • Jun 2, 2025 • 14
POTION Collection These are the flagship POTION models. Load them and use them with model2vec (https://github.com/MinishLab/model2vec) or sentence-transformers • 6 items • Updated Nov 13, 2025 • 14
Article Blazingly fast whisper transcriptions with Inference Endpoints • May 13, 2025 • 81
Orpheus Multilingual Research Release Collection Beta Release of multilingual models. • 12 items • Updated Apr 10, 2025 • 108
Article From Llasa to Llasagna 🍕: Finetuning LLaSA to generates Italian speech and other languages • Feb 11, 2025 • 33
Danish Text Datasets Collection These include high-quality Danish text datasets for pre-training, fine-tuning, etc. • 16 items • Updated Dec 15, 2024 • 3