Quantization request

#1
by PLAYER55868 - opened

We sincerely and earnestly request the release of an AWQ Q4 quantization of gemma3 27b. Thank you for your contributions.

I'm not sure AWQ is supported here; the only supported calibration dataset for gemma is 'contextual', and I usually use wikitext2 with AWQ. GPTQ might work instead, but I'm not sure; I'll look at the export config later. Either way, 'stock' int4 will work, but be warned: I remember from gemma2 27b that int4 brought the model down to ~15.4 GB, barely enough to fit on an A770. Add the SigLIP 2 vision tower and it may be more. Plus, OpenVINO quantization is dynamic and changes model to model, so the final size can be hard to predict.
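As a rough sanity check on that ~15.4 GB figure, here is a back-of-envelope size estimate for an int4 weight-only quantization. The group size, the fraction of weights kept at fp16, and the scale layout are illustrative assumptions, not the actual NNCF/OpenVINO format:

```python
def int4_size_gb(n_params: float, group_size: int = 128,
                 fp16_fraction: float = 0.05) -> float:
    """Rough on-disk size of an int4 weight-only quantized model, in GB.

    Assumptions (illustrative, not the real OpenVINO layout):
    - most weights stored at 4 bits (0.5 bytes each)
    - a small fraction (embeddings, sensitive layers) kept at fp16
    - one fp16 scale per group of `group_size` int4 weights
    """
    int4_bytes = n_params * (1 - fp16_fraction) * 0.5
    fp16_bytes = n_params * fp16_fraction * 2.0
    scale_bytes = n_params * (1 - fp16_fraction) / group_size * 2.0
    return (int4_bytes + fp16_bytes + scale_bytes) / 1e9

# A ~27B-parameter model lands in the mid-teens of GB, consistent with
# the ~15.4 GB observed for gemma2 27b, and tight against an A770's 16 GB.
print(f"~{int4_size_gb(27e9):.1f} GB")
```

This also shows why adding the vision tower matters: even one or two extra GB of activations and vision weights can push the model past a 16 GB card.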

However, for zero-shot image classification there may be hope. For chat? I don't think so, not at 27b.

Also, check out the project these quantizations are for: https://github.com/SearchSavior/OpenArc. Vision support is being merged today or tomorrow, and there is a Discord link in the repo.
