Many thanks for contributing the code! This is a new approach. I'm using 4-bit and 8-bit quantization with accelerate.