
GGUF support

#4
by RedEyed - opened

Hello, the model looks very promising!
I want to try it locally via llama.cpp/ollama. Will the model be available in GGUF format?

Thank you.

Always the same bulls*** .... nerds get top priority, but the average person who uses GGUF comes second... sigh

I pushed an FP8 safetensors version you can run on a 3090 for now.
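A quick back-of-envelope on why FP8 weights can fit on a 24 GB RTX 3090: FP8 stores one byte per weight versus two for FP16, halving the weight footprint. A minimal sketch; the parameter count below is an assumption for illustration only, not the model's actual size.

```python
# Hedged arithmetic: FP8 vs FP16 weight memory (weights only,
# ignoring KV cache and activations).
params = 9e9                  # assumed ~9B parameters (illustrative)
fp16_gb = params * 2 / 1e9    # 2 bytes per weight in FP16
fp8_gb = params * 1 / 1e9     # 1 byte per weight in FP8
print(f"FP16: {fp16_gb} GB, FP8: {fp8_gb} GB")
```

With these assumed numbers, FP8 leaves comfortable headroom on a 24 GB card where FP16 would be tight once the KV cache is added.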

Working on llama.cpp support today, which is required before a GGUF can even exist. Nemotron-H is a new hybrid architecture.

It’s not some trivial thing. It’s a 57-layer hybrid state-space model interwoven with transformer MLP layers.
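To see why a hybrid stack is harder to support than a plain transformer: the converter and inference engine have to know, layer by layer, whether a block is a state-space (Mamba-style) block or an attention/MLP block. A minimal sketch of such a layer plan; the interleaving pattern and `attention_every` interval below are purely illustrative assumptions, not Nemotron-H's actual layout.

```python
# Hedged sketch: an illustrative layer plan for a 57-layer hybrid
# model mixing state-space blocks with occasional attention blocks.
# The real Nemotron-H interleaving pattern is not reproduced here.
def build_layer_plan(num_layers=57, attention_every=8):
    """Return an illustrative list of block types for a hybrid stack."""
    plan = []
    for i in range(num_layers):
        if i % attention_every == attention_every - 1:
            plan.append("attention")  # occasional self-attention block
        else:
            plan.append("ssm")        # Mamba-style state-space block
    return plan

plan = build_layer_plan()
print(len(plan), plan.count("ssm"), plan.count("attention"))
```

A GGUF converter has to serialize different tensor sets for each block type, which is why new architecture support in llama.cpp itself is the prerequisite, not just a file-format conversion.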
