On-Device LLM Throughput Calculator 🚀 • Generate throughput plot for LLMs on devices • 21
Anemll converted models • Collection • Preconverted models for https://github.com/Anemll/Anemll. ctx = context; 0.1.x = converted with Anemll v0.1.x; x = 1 and x = 2 are identical model-wise • 6 items • Updated Feb 19 • 3
NousResearch/DeepHermes-3-Llama-3-8B-Preview • Text Generation • 8B • Updated Apr 10 • 305 • 354
Article • Fine-tuning LLMs to 1.58bit: extreme quantization made easy • Sep 18, 2024 • 272
enterprise-explorers/Llama-2-7b-chat-coreml • Text Generation • Updated Jul 18, 2023 • 1.22k • 138