4-bit Llammas in GGUF

This is a 4-bit quantized version of the TartuNLP/Llammas model, which is based on Llama 2, in the GGUF file format.
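Since the model ships as a single GGUF file, a quick sanity check after downloading is to read the file header. The sketch below is not part of the original model card. It assumes a little-endian GGUF file of version 2 or later, where the header is the 4-byte magic `GGUF`, a 32-bit version, a 64-bit tensor count, and a 64-bit metadata key/value count. The function name `read_gguf_header` is made up for illustration.

```python
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF file.

    Assumes a little-endian file with GGUF version >= 2, where both
    counts are stored as unsigned 64-bit integers.
    """
    with open(path, "rb") as f:
        header = f.read(24)  # 4 (magic) + 4 (version) + 8 + 8 (counts)
    if len(header) < 24 or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    # "<" means little-endian with no padding: uint32, uint64, uint64
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:24])
    return version, tensor_count, kv_count
```

For example, `read_gguf_header("llammas-4bit.gguf")` would return the version and counts for a local copy of the model. The file name here is hypothetical; use whatever name the downloaded file actually has.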

Format: GGUF
Model size: 6.74B params
Architecture: llama

Quantization: 4-bit
