jiangchengchengNLP/L3.3-MS-Nevoria-70b-NVFP4A16

Text Generation

text-generation-inference

8-bit precision

compressed-tensors


This checkpoint was quantized with llm-compressor and supports inference with vLLM and SGLang.
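As a rough illustration of the vLLM path, the sketch below loads the checkpoint through vLLM's offline `LLM` class. The helper names `engine_kwargs` and `generate` are hypothetical, and the values for `tensor_parallel_size`, `max_model_len`, and the sampling settings are assumptions rather than settings recommended by the model author; adjust them to the available hardware.

```python
# Minimal sketch, assuming vLLM is installed and the GPU(s) can hold the checkpoint.
MODEL_ID = "jiangchengchengNLP/L3.3-MS-Nevoria-70b-NVFP4A16"


def engine_kwargs(tensor_parallel_size: int = 1, max_model_len: int = 4096) -> dict:
    """Collect constructor arguments for vllm.LLM (default values are assumptions)."""
    return {
        "model": MODEL_ID,
        "tensor_parallel_size": tensor_parallel_size,
        "max_model_len": max_model_len,
    }


def generate(prompt: str, max_tokens: int = 64) -> str:
    """Load the quantized checkpoint with vLLM and return one completion."""
    # Imported lazily so this module can be read or tested without vLLM installed.
    from vllm import LLM, SamplingParams

    llm = LLM(**engine_kwargs())
    params = SamplingParams(temperature=0.7, max_tokens=max_tokens)
    outputs = llm.generate([prompt], params)
    # Each request yields a RequestOutput; take the text of its first completion.
    return outputs[0].outputs[0].text
```

Calling `generate("Write a haiku about quantization.")` would download the weights on first use and return the generated text. vLLM reads the quantization settings from the checkpoint's own config, so this sketch passes no explicit quantization argument.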

Downloads last month: 8

Safetensors

Model size: 40.6B params

Tensor type: F32 · BF16 · F8_E4M3 · U8

Model tree for jiangchengchengNLP/L3.3-MS-Nevoria-70b-NVFP4A16

Base model: Steelskull/L3.3-MS-Nevoria-70b

Quantized (28): this model