Article Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance By tiiuae and 5 others • 14 days ago • 25
HuggingFace's Transformers: State-of-the-art Natural Language Processing Paper • 1910.03771 • Published Oct 9, 2019 • 19
RecurrentGemma: Moving Past Transformers for Efficient Open Language Models Paper • 2404.07839 • Published Apr 11, 2024 • 48
Article Welcome to Inference Providers on the Hub 🔥 By julien-c and 6 others • Jan 28 • 482
Article SmolVLM Grows Smaller – Introducing the 250M & 500M Models! By andito and 2 others • Jan 23 • 180
Article Mastering Long Contexts in LLMs with KVPress By nvidia and 1 other • Jan 23 • 68
Article Timm ❤️ Transformers: Use any timm model with transformers By ariG23498 and 4 others • Jan 16 • 50
Article Don't repeat yourself - 🤗 Transformers Design Philosophy By patrickvonplaten • Apr 5, 2022 • 34
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 775
Qwen1.5 Collection Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated Apr 28 • 209
Canonical models Collection This collection lists all the historical (pre-"Hub") canonical model checkpoints, i.e. repos that were not under an org or user namespace • 68 items • Updated Feb 13, 2024 • 14