Models: 220

Active filters: llama.cpp

ReallyFloppyPenguin/Dhanishtha-2.0-preview-GGUF

Updated 13 days ago

ReallyFloppyPenguin/DeepSeek-R1-Distill-Qwen-1.5B-GGUF

2B • Updated 13 days ago • 114

ReallyFloppyPenguin/DeepSWE-Preview-GGUF

33B • Updated 13 days ago • 99

JonathanMiddleton/Qwen3-Embedding-8B-GGUF

8B • Updated 8 days ago • 180

Makatia/mistral-7b-instruct-v0.2.Q8_0-Q8_0.gguf

7B • Updated 8 days ago • 27

JonathanMiddleton/Qwen3-Reranker-4B-GGUF

Text Ranking • 4B • Updated 7 days ago • 81

Arivukkarasu/TinyLlama-1.1B-Chat-GGUF

1B • Updated 6 days ago • 24

Arivukkarasu/Mistral-7B-Instruct-v0.3-GGUF

7B • Updated 6 days ago • 16

PJEDeveloper/Mistral_Nemo_Instruct_2407-F16.gguf-Q4_K_M

12B • Updated 2 days ago • 32

theprint/Zeth-Gemma3-4B-GGUF

Text Generation • 5B • Updated 1 day ago • 34