Article Cohere on Hugging Face Inference Providers By burtenshaw and 6 others • Apr 16 • 127
Article Making LLMs Smaller Without Breaking Them: A GLU-Aware Pruning Approach By oopere • Nov 24, 2024 • 8
Dolphin 3.0 Collection Dolphin 3.0 is the next generation of the Dolphin series of instruct-tuned models. Designed to be the ultimate general purpose local model. • 9 items • Updated Feb 7 • 174
Unsloth 4-bit Dynamic Quants Collection Unsloth's Dynamic 4-bit Quants selectively skip quantizing certain parameters, greatly improving accuracy while only using <10% more VRAM than BnB 4-bit • 28 items • Updated 12 days ago • 83