Alternative quantizations.

#7 opened by ZeroWw

https://huggingface.co/ZeroWw/Meta-Llama-3-8B-Instruct-abliterated-v3-GGUF

My own (ZeroWw) quantizations: output and embed tensors quantized to f16, all other tensors quantized to q5_k or q6_k.
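
For anyone who wants to reproduce this kind of mixed quant, here is a rough sketch using llama.cpp's llama-quantize tool and its --output-tensor-type / --token-embedding-type flags; the file names are placeholders, and it assumes you already have an f16 GGUF conversion of the model:

```sh
# keep output.weight and the token embeddings at f16, quantize everything else to q6_k
./llama-quantize --output-tensor-type f16 --token-embedding-type f16 \
    model.f16.gguf model.f16.q6.gguf q6_k

# same recipe with a q5_k body for the smaller variant
./llama-quantize --output-tensor-type f16 --token-embedding-type f16 \
    model.f16.gguf model.f16.q5.gguf q5_k
```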

Result: both the f16.q6 and f16.q5 files are smaller than a standard q8_0 quantization, and they perform as well as the pure f16.
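
A rough back-of-envelope on the size claim (my numbers, assuming the usual effective bit rates in llama.cpp: ~8.5 bits/weight for q8_0, ~6.56 for q6_k, ~5.5 for q5_k): Llama-3-8B has ~8.0B parameters, of which the token embedding and output tensors are ~0.53B each (128256 x 4096, untied). Keeping those ~1.05B weights at f16 costs ~2.1 GB, and the remaining ~7.0B at q6_k is ~5.7 GB, so ~7.8 GB total; the q5_k body lands around ~6.9 GB. A q8_0 quant of all tensors is ~8.5 GB, so both mixed variants do come out smaller.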

@failspy hello! Thanks for the abliterated versions. Could you please also do this one: https://huggingface.co/gradientai/Llama-3-8B-Instruct-Gradient-4194k

And Mistral Instruct v0.3?