Models: 114

Active filters: arxiv: 2210.17323

daedalus314/Marx-3B-V2-GPTQ

Text Generation • Updated Oct 12, 2023 • 19

TRAC-MTRY/traclm-v2-7b-instruct-GPTQ

Text Generation • Updated Dec 22, 2023

iproskurina/bloom-1b7-GPTQ-4bit-g128

Text Generation • Updated Sep 24 • 25

iproskurina/bloom-3b-GPTQ-4bit-g128

Text Generation • Updated Sep 24 • 21

iproskurina/bloom-560m-GPTQ-4bit-g128

Text Generation • Updated Sep 24 • 13

iproskurina/bloom-1b1-GPTQ-4bit-g128

Text Generation • Updated Sep 24 • 15

iproskurina/bloom-7b1-GPTQ-4bit-g128

Text Generation • Updated Sep 24 • 26 • 2

iproskurina/opt-350m-GPTQ-4bit-g128

Text Generation • Updated Sep 24 • 20

iproskurina/opt-1.3b-GPTQ-4bit-g128

Text Generation • Updated Sep 24 • 13

iproskurina/opt-2.7b-GPTQ-4bit-g128

Text Generation • Updated Sep 24 • 16

iproskurina/opt-6.7b-GPTQ-4bit-g128

Text Generation • Updated Sep 24 • 16

iproskurina/opt-13b-GPTQ-4bit-g128

Text Generation • Updated Sep 24 • 26

neuralmagic/zephyr-7b-beta-marlin

Text Generation • Updated Mar 6 • 8.33k

neuralmagic/TinyLlama-1.1B-Chat-v1.0-marlin

Text Generation • Updated Mar 6 • 5.28k • 1

neuralmagic/OpenHermes-2.5-Mistral-7B-marlin

Text Generation • Updated Mar 6 • 1.3k • 2

neuralmagic/Nous-Hermes-2-Yi-34B-marlin

Text Generation • Updated Mar 6 • 13 • 5

softmax/Llama-2-70b-chat-hf-marlin

Text Generation • Updated Mar 17 • 477

softmax/falcon-180B-chat-marlin

Text Generation • Updated Mar 21 • 13

smpanaro/Llama-2-7b-NuGPTQ

Text Generation • Updated Oct 12 • 25 • 1

TRAC-MTRY/traclm-v3-7b-instruct-GPTQ

Text Generation • Updated May 2

astronomer/Llama-3-8B-Instruct-GPTQ-8-Bit

Text Generation • Updated Apr 22 • 791 • 25

astronomer/Llama-3-8B-Instruct-GPTQ-4-Bit

Text Generation • Updated Apr 22 • 4.92k • 24

astronomer/Llama-3-8B-GPTQ-8-Bit

Text Generation • Updated Apr 22 • 29 • 2

astronomer/Llama-3-8B-GPTQ-4-Bit

Text Generation • Updated Apr 22 • 57 • 6

SwastikM/Llama-2-7B-Chat-text2code

Text Generation • Updated May 19 • 17 • 4

davidxmle/Llama-3-8B-Instruct-GPTQ-4-Bit-Debug

Text Generation • Updated Apr 30 • 7

IntelLabs/sqft-phi-3-mini-4k-50-base-gptq

Text Generation • Updated 15 days ago • 474 • 1

drbh/flash-attention-pre-compile

IntelLabs/sqft-mistral-7b-v0.3-50-base-gptq

Text Generation • Updated 15 days ago • 98 • 1

neuralmagic/Phi-3-mini-128k-instruct-quantized.w8a16

Text Generation • Updated Oct 9 • 23