Models: 719

Active filters: vllm

RedHatAI/Qwen2-57B-A14B-Instruct-FP8

Text Generation • Updated Jul 18, 2024 • 1.29k • 1

nm-testing/Meta-Llama-3-8B-Instruct-FP8-K-V

Text Generation • Updated Oct 9, 2024 • 23

RedHatAI/DeepSeek-Coder-V2-Lite-Instruct-FP8

Text Generation • Updated Jul 18, 2024 • 11.7k • 7

RedHatAI/DeepSeek-Coder-V2-Lite-Base-FP8

Text Generation • Updated Jul 18, 2024 • 22

mgoin/Mistral-Nemo-Instruct-2407-FP8-Dynamic

Text Generation • Updated Jul 18, 2024 • 138

mgoin/Mistral-Nemo-Instruct-2407-FP8-KV

Text Generation • Updated Jul 18, 2024 • 15

RedHatAI/Mistral-Nemo-Instruct-2407-FP8

Text Generation • Updated Jul 19, 2024 • 22.1k • 18

FlorianJc/Mistral-Nemo-Instruct-2407-vllm-fp8

Text Generation • Updated Jul 31, 2024 • 230 • 8

RedHatAI/DeepSeek-Coder-V2-Base-FP8

Text Generation • Updated Jul 22, 2024 • 63

RedHatAI/DeepSeek-Coder-V2-Instruct-FP8

Text Generation • Updated Jul 22, 2024 • 1.33k • 7

mgoin/Minitron-4B-Base-FP8

Text Generation • Updated Aug 16, 2024 • 25 • 3

mgoin/Minitron-8B-Base-FP8

Text Generation • Updated Jul 26, 2024 • 24 • 3

mgoin/nemotron-3-8b-chat-4k-sft-hf

Text Generation • Updated Nov 13, 2024 • 21

RedHatAI/Meta-Llama-3.1-8B-Instruct-FP8

Text Generation • Updated Oct 9, 2024 • 229k • 43

RedHatAI/Meta-Llama-3.1-8B-Instruct-FP8-dynamic

Text Generation • Updated 7 days ago • 37.3k • 4

RedHatAI/Meta-Llama-3.1-70B-Instruct-FP8-dynamic

Text Generation • Updated Oct 19, 2024 • 2.49k • 6

RedHatAI/Meta-Llama-3.1-405B-Instruct-FP8

Text Generation • Updated Oct 9, 2024 • 1.84k • 31

RedHatAI/Meta-Llama-3.1-405B-Instruct-FP8-dynamic

Text Generation • Updated Oct 19, 2024 • 132 • 15

RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w8a16

Text Generation • Updated Oct 23, 2024 • 2.64k • 10

RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w8a8

Text Generation • Updated 7 days ago • 11.7k • 17

mistralai/Mistral-Large-Instruct-2407

Updated Oct 16, 2024 • 13.2k • 832

mgoin/Nemotron-4-340B-Base-hf

Text Generation • Updated Aug 8, 2024 • 13 • 1

mgoin/Nemotron-4-340B-Base-hf-FP8

Text Generation • Updated Aug 8, 2024 • 127 • 2

RedHatAI/Meta-Llama-3.1-70B-Instruct-quantized.w8a16

Text Generation • Updated Oct 9, 2024 • 514 • 5

mgoin/Nemotron-4-340B-Instruct-hf

Text Generation • Updated Aug 8, 2024 • 26 • 4

mgoin/Nemotron-4-340B-Instruct-hf-FP8

Text Generation • Updated Aug 8, 2024 • 178 • 3

FlorianJc/ghost-8b-beta-vllm-fp8

Text Generation • Updated Jul 25, 2024 • 22

FlorianJc/Meta-Llama-3.1-8B-Instruct-vllm-fp8

Text Generation • Updated Jul 25, 2024 • 36

RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w4a16

Text Generation • Updated 7 days ago • 22.2k • 27

RedHatAI/Meta-Llama-3.1-70B-Instruct-quantized.w8a8

Text Generation • Updated Feb 11 • 14.1k • 20