Active filters: 8-bit
MaziyarPanahi/Meta-Llama-3-8B-Instruct-GGUF • Text Generation • 1.94M downloads • 83 likes
mlx-community/phi-4-8bit • Text Generation • 32.6k downloads • 10 likes
MaziyarPanahi/Mistral-7B-Instruct-v0.3-GGUF • Text Generation • 1.92M downloads • 74 likes
MaziyarPanahi/Llama-3.3-70B-Instruct-GGUF • Text Generation • 1.38M downloads • 10 likes
huihui-ai/Llama-3.3-70B-Instruct-abliterated-finetuned-GPTQ-Int8 • Text Generation • 235k downloads • 5 likes
MaziyarPanahi/Qwen2.5-Coder-0.5B-QwQ-draft-GGUF • Text Generation • 390 downloads • 3 likes
nejumi/phi-4-GPTQ-Int8-calib-ja-1k • 69 downloads • 2 likes
lumolabs-ai/Lumo-8B-Instruct • 498 downloads • 2 likes
CyberNative/CyberBase-13b • Text Generation • 229 downloads • 26 likes
MaziyarPanahi/Mixtral-8x22B-v0.1-GGUF • Text Generation • 1.11M downloads • 74 likes
MaziyarPanahi/WizardLM-2-7B-GGUF • Text Generation • 1.86M downloads • 75 likes
MaziyarPanahi/WizardLM-2-8x22B-GGUF • Text Generation • 34.8k downloads • 126 likes
Zoyd/mlabonne_NeuralDaredevil-8B-abliterated-8_0bpw_exl2 • Text Generation • 27 downloads • 2 likes
MaziyarPanahi/Qwen2-7B-Instruct-GGUF • Text Generation • 1.84M downloads • 11 likes
kim512/Llama-3-70b-Arimas-story-RP-V1.6-8.0bpw-h8-exl2 • Text Generation • 25 downloads • 1 like
MaziyarPanahi/firefunction-v2-GGUF • Text Generation • 1.83M downloads • 16 likes
neuralmagic/Meta-Llama-3-70B-Instruct-quantized.w8a16 • Text Generation • 303 downloads • 4 likes
MaziyarPanahi/Mistral-Nemo-Instruct-2407-GGUF • Text Generation • 1.89M downloads • 40 likes
meta-llama/Llama-Guard-3-8B-INT8 • Text Generation • 1.96k downloads • 32 likes
MaziyarPanahi/Meta-Llama-3.1-8B-Instruct-GGUF • Text Generation • 1.83M downloads • 15 likes
neuralmagic/Meta-Llama-3.1-70B-Instruct-quantized.w8a8 • Text Generation • 7.61k downloads • 19 likes
LoneStriker/Hermes-3-Llama-3.1-8B-8.0bpw-h8-exl2 • 9 downloads • 1 like
Qwen/Qwen2-VL-7B-Instruct-GPTQ-Int8 • Image-Text-to-Text • 6.25k downloads • 21 likes
Qwen/Qwen2-VL-2B-Instruct-GPTQ-Int8 • Image-Text-to-Text • 3.2k downloads • 13 likes
HF1BitLLM/Llama3-8B-1.58-100B-tokens • Text Generation • 1.84k downloads • 167 likes
MaziyarPanahi/solar-pro-preview-instruct-GGUF • Text Generation • 1.83M downloads • 24 likes
Qwen/Qwen2-VL-72B-Instruct-GPTQ-Int8 • Image-Text-to-Text • 5.3k downloads • 8 likes
Qwen/Qwen2.5-7B-Instruct-GPTQ-Int8 • Text Generation • 23.3k downloads • 12 likes
Qwen/Qwen2.5-14B-Instruct-GPTQ-Int8 • Text Generation • 25.6k downloads • 13 likes
Qwen/Qwen2.5-72B-Instruct-GPTQ-Int8 • Text Generation • 5.18k downloads • 16 likes