Could you help create a 3.25bpw model so it can fit on an A100?

#1
by davideuler - opened

Thanks for the great work.
The 3.25bpw mistral-8x22B-0.1 works fine on an A100. Could you help create a 3.25bpw version of this model?

Sure. I'll add that to my list.

You've already uploaded the 3.5bpw model; I think it should work on an A100. I'll test it. Many thanks, Dracones.

The 4.0bpw model works on an A100 80G. I set max_seq_len to 8192 to avoid OOM in tabbyAPI.

I've also posted a 3.25bpw model if you want to play with that at higher context lengths: https://huggingface.co/Dracones/WizardLM-2-8x22B_exl2_3.25bpw
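
For a rough sense of why a lower bpw leaves more room for context, here is a back-of-the-envelope sketch. It assumes about 141B total parameters (the figure published for Mixtral 8x22B, which I'm assuming carries over to this model) and treats the A100 as a nominal 80 GB, ignoring the GB vs GiB distinction. It counts weights only, so the KV cache, activations, and any layers kept at a different precision are not included, and real usage will differ.

```python
# Rough weight-memory estimate for a quantized model at a given bits-per-weight (bpw).
# ASSUMPTION: ~141e9 total parameters (published figure for Mixtral 8x22B).
# ASSUMPTION: 80 GB nominal GPU memory; KV cache and runtime overhead are NOT modeled.
PARAMS = 141e9
GPU_GB = 80.0


def weight_gb(bpw: float, params: float = PARAMS) -> float:
    """Approximate weight size in GB: parameters * bits per weight / 8 bits per byte."""
    return params * bpw / 8 / 1e9


if __name__ == "__main__":
    for bpw in (3.25, 3.5, 4.0):
        w = weight_gb(bpw)
        print(f"{bpw:.2f} bpw: ~{w:.1f} GB weights, ~{GPU_GB - w:.1f} GB left for cache and overhead")
```

Whatever is left after the weights has to cover the cache, which grows with max_seq_len, so the estimate is only a starting point for picking a context length.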

Dracones changed discussion status to closed
