This model was converted and quantized using the official tools provided by llama.cpp.
Serving:

```shell
# Interactive chat in the terminal
./build/bin/llama-cli -m /models/nvidia-llama-3_1-nemotron-nano-8b-v1-q4_k_m.gguf -ngl 999

# HTTP server with an OpenAI-compatible API (defaults to http://127.0.0.1:8080)
./build/bin/llama-server -m /models/nvidia-llama-3_1-nemotron-nano-8b-v1-q4_k_m.gguf -ngl 999
```

The `-ngl 999` flag offloads as many model layers as possible to the GPU.
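Once `llama-server` is running, it can be queried over its OpenAI-compatible `/v1/chat/completions` endpoint. A minimal Python sketch, assuming the server is listening on the default `127.0.0.1:8080` (the model name in the payload is informational; llama-server serves whichever model it was launched with):

```python
# Sketch: send a chat completion request to a local llama-server instance.
import json
import urllib.request

# Hypothetical prompt; adjust to taste.
payload = {
    "model": "nvidia-llama-3_1-nemotron-nano-8b-v1-q4_k_m",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 64,
}

req = urllib.request.Request(
    "http://127.0.0.1:8080/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment once the server above is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The same endpoint also works with the official `openai` Python client by pointing its `base_url` at the local server.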
Model tree for yeahdongcn/nvidia-llama-3_1-nemotron-nano-8b-v1-q4_k_m-gguf
Base model: nvidia/Llama-3.1-Nemotron-Nano-8B-v1