Could you help create a 3.25bpw model so it can fit on an A100?
#1, opened by davideuler
Thanks for the great work.
The 3.25bpw mistral-8x22B-0.1 works fine on an A100. Could you help create a 3.25bpw version of this model?
Sure. I'll add that to my list.
You've already uploaded the 3.5bpw model; I think it should work on an A100. I'll test it. Many thanks, Dracones.
The 4.0bpw model works on an A100 80G. I set max_seq_len to 8192 in tabbyAPI to avoid OOM.
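For a rough sense of why the bpw choice matters on an 80G card, here is a back-of-the-envelope sketch of the weight footprint at each quantization level. The parameter count of about 141 billion is an assumption for an 8x22B mixture-of-experts model and is not stated in this thread; `weight_gb` is a hypothetical helper, and the estimate ignores the KV cache, activations, and framework overhead.

```python
# Rough estimate of quantized weight size versus GPU memory.
# ASSUMPTION: ~141e9 total parameters for an 8x22B model (not from the thread).
ASSUMED_PARAMS = 141e9
GPU_MEMORY_GB = 80  # A100 80G, treated loosely as 80e9 bytes


def weight_gb(bits_per_weight: float, params: float = ASSUMED_PARAMS) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    for bpw in (3.25, 3.5, 4.0):
        weights = weight_gb(bpw)
        headroom = GPU_MEMORY_GB - weights
        print(
            f"{bpw:.2f} bpw: ~{weights:.1f} GB weights, "
            f"~{headroom:.1f} GB left for cache and overhead"
        )
```

Under that assumption, the 4.0bpw weights take up most of the card, which would be consistent with needing a reduced max_seq_len, while lower bpw leaves more room for context.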
I've also posted a 3.25bpw model if you want to play with that at higher context lengths: https://huggingface.co/Dracones/WizardLM-2-8x22B_exl2_3.25bpw
Dracones changed discussion status to closed