Rombos-LLM-V2.6-Nemotron-70b by Rombodawg


ExLlamaV2 Quantization

Quantized with ExLlamaV2 v0.2.3

Available quantizations:

2.2 bits per weight
4.65 bits per weight
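
A minimal sketch of loading one of these EXL2 quants with the exllamav2 Python API (v0.2.x-style basic generator). The model directory, sampling settings, and token budget below are illustrative assumptions, not part of this card:

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

# Assumed local path to a downloaded quant branch (e.g. the 4.65 bpw variant)
config = ExLlamaV2Config()
config.model_dir = "/models/Rombos-LLM-V2.6-Nemotron-70b-exl2-4.65bpw"
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)   # lazy cache so load_autosplit can spread layers across GPUs
model.load_autosplit(cache)

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

# Example sampling settings (values are assumptions, tune to taste)
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8
settings.top_p = 0.9

print(generator.generate_simple("Explain quantization in one sentence.", settings, 128))
```

At 2.2 bpw a 70B model fits in far less VRAM than at 4.65 bpw, at the cost of output quality, so pick the variant that matches your hardware.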

I applied the last step of my continuous finetuning method to the Nemotron-70b model from Nvidia. More details below:

Quants: (Coming soon)

Open-LLM-Leaderboard scores: (Coming soon)
