Why does Deepseek 67B have a bigger file size than 70B models?
#1 opened by OrangeApples
@LoneStriker I'm comparing your 2.65bpw quants of Deepseek 67B and Euryale 1.4 L2 70B, and Deepseek's is about 700 MB bigger. However, when I checked TheBloke's GGUF quants of both, the opposite was true, which is what I expected. Not a big deal, but I'm curious what caused this.
I'm not sure, actually; that's something we would need to ask the author of Exllamav2 about. I have anecdotally noticed, however, that Deepseek also seems to take more VRAM than L2 70B under exl2, so the larger file size is reflected in VRAM usage. As to why, my guess would be the tokenizer: a ~100K-token vocabulary vs. Llama 2's 32K.