Models (15)

Active filters: smooth_quant

noneUsername/Orca-2-13b-w8-lmdeploy

Updated Nov 11, 2024 • 10

internlm/internlm3-8b-instruct-smoothquant-int8

Text Generation • Updated Jan 15 • 37 • 4

internlm/internlm3-8b-instruct-smoothquant-fp8

Text Generation • Updated Jan 17 • 28 • 1

fabiolecca/almawave-velvet-14b-int8

Updated Feb 1 • 6 • 2

G-reen/Qwen2.5-Coder-32b-Instruct-Fp8

Updated Feb 10 • 9

G-reen/Mistral-Small-2501-Instruct-Fp8

Updated Feb 9 • 102

radna/r1-14b-fp8

Updated Feb 28 • 6

radna/r1-7b-fp8

Updated Feb 28 • 27

radna/r1-14b-int8

Updated Mar 1 • 16

radna/r1-14b-float8_e4m3fn

Updated Mar 1 • 8

radna/r1-14b-float8_e5m2

Updated Mar 1 • 8

radna/r1-14b-int8-mid

Updated Mar 1 • 10

radna/r1-14b-float8_e4m3fn-mid

Updated Mar 1 • 5

radna/r1-14b-float8_e5m2-mid

Updated Mar 1 • 10

adriabama06/DeepCoder-1.5B-Preview-FP8-W8A8

Text Generation • Updated Apr 13 • 17 • 1