Article Fine-tuning LLMs to 1.58bit: extreme quantization made easy By medmekk and 5 others • Sep 18, 2024 • 255
Article Enhance Your Models in 5 Minutes with the Hugging Face Kernel Hub By drbh and 6 others • 17 days ago • 102
Article 💥 Building a Vulnerable Bank MCP — Then Automating an Agent to Hack It By jdelavande and 2 others • 10 days ago • 8
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics Paper • 2506.01844 • Published 26 days ago • 104
Changelog Xet is now the default storage option for new users and organizations May 23 • 66
Article Prefill and Decode for Concurrent Requests - Optimizing LLM Performance By tngtech • Apr 16 • 18
Article Reduce, Reuse, Recycle: Why Open Source is a Win for Sustainability By sasha and 1 other • May 7 • 15
Article Falcon-Edge: A series of powerful, universal, fine-tunable 1.58bit language models. By tiiuae and 9 others • May 15 • 35
Article Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference By mfuntowicz and 1 other • Jan 16 • 75