Minitron Collection A family of compressed models obtained via pruning and knowledge distillation • 12 items • Updated 20 days ago • 60
Hymba: A Hybrid-head Architecture for Small Language Models Paper • 2411.13676 • Published Nov 20, 2024 • 41
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models Paper • 2409.17481 • Published Sep 26, 2024 • 47
LLM Pruning and Distillation in Practice: The Minitron Approach Paper • 2408.11796 • Published Aug 21, 2024 • 58