--leave-output-tensor !
#13 opened by ZeroWw
All quants should also have an alternate version quantized with --leave-output-tensor, so we can see whether that ~30% bigger file actually performs better...
@ZeroWw would you like those for this model or for any other specific model, and in any specific sizes? I'll try to include them in future models.
I ran some tests... the model that currently holds up best under quantization is Mistral-7b-Instruct-v0.2.
With llama-3-8b I am getting terrible results even at q8_0.
Thanks for the offer though.