GGUF support
#4
by RedEyed - opened
Hello, model looks very promising!
I want to try it locally via llama.cpp/ollama, will the model be available in GGUF format?
Thank you.
Always the same bulls*** .... nerds get top priority, but the average person who uses GGUF comes second... sigh
I pushed a safetensors fp8 checkpoint you can run on a 3090 for now.
Working on llama.cpp support today, which is required to even get a GGUF. Nemotron-H is a new hybrid architecture.
It's not some trivial thing. It's a 57-layer hybrid state-space model interwoven with transformer MLP layers.
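To give a rough picture of why this is non-trivial for llama.cpp: a hybrid model doesn't have one uniform block type that can be repeated, so the conversion and inference code has to know which layer type sits at each depth. Here's a minimal illustrative sketch of such an interleaved layer layout — the 57-layer count comes from the comment above, but the specific ordering and the `attention_every` spacing are assumptions for illustration, not Nemotron-H's actual pattern:

```python
# Hypothetical sketch of a hybrid layer stack: mostly SSM (Mamba-style)
# layers, with an attention layer interleaved at a fixed interval.
# The 57-layer total is from the discussion above; the spacing is an
# illustrative assumption, not the real Nemotron-H layout.
def build_layer_pattern(n_layers: int = 57, attention_every: int = 8) -> list[str]:
    """Return the layer-type sequence for a hypothetical hybrid model."""
    pattern = []
    for i in range(n_layers):
        if i % attention_every == attention_every - 1:
            pattern.append("attention")  # occasional full attention layer
        else:
            pattern.append("ssm")        # state-space (Mamba-style) layer
    return pattern

if __name__ == "__main__":
    layers = build_layer_pattern()
    print(len(layers), layers.count("attention"), layers.count("ssm"))
```

A GGUF converter for a model like this has to serialize per-layer metadata (which type each layer is, plus the SSM state tensors) that a plain transformer conversion never needed, which is why llama.cpp support has to land first.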