Is 2x13B or 3x13B possible?
#5
by itsnottme - opened
Great model. I am wondering, though, if a similar model with fewer experts is possible, something like 2x13B or 3x13B, so it would fit on smaller computers.
itsnottme changed discussion status to closed
Hello, sorry I didn't see your message earlier. It wasn't possible back then because, for GGUF, llama.cpp only accepted models with a power-of-two number of experts, and 2 itself was not accepted, so the only options were 4, 8, or 16 (or 32, 64...).
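As a quick illustration of that constraint (a sketch only; the actual check inside llama.cpp may be written differently), the accepted expert counts were the powers of two greater than 2:

```python
def is_valid_expert_count(n: int) -> bool:
    """Expert counts accepted at the time: powers of two, excluding 2 itself."""
    is_power_of_two = n > 0 and (n & (n - 1)) == 0
    return is_power_of_two and n > 2

# 2x13B and 3x13B would have been rejected; 4x13B, 8x13B, ... were fine.
for n in (2, 3, 4, 8, 16):
    print(n, is_valid_expert_count(n))
```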
It could be possible today, but Llama 2 models below 70B don't have GQA, which makes them very heavy to load and use. So I don't think I will make another Mixtral-like MoE model out of Llama 2, like this one.
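To give a rough idea of why the missing GQA matters, here is a back-of-the-envelope sketch of the KV-cache size per sequence, using the published Llama 2 13B shape (40 layers, 40 attention heads, head dim 128, fp16 cache) versus a hypothetical GQA variant with 8 KV heads (the grouping Llama 2 70B uses). In a Mixtral-style MoE only the FFN is expert-routed, so the attention cost stays that of the base dense model; the numbers are approximate.

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_elem: int = 2) -> int:
    """Approximate KV-cache size for one sequence: keys + values across all layers."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

seq_len = 4096

# Llama 2 13B: no GQA, so every attention head keeps its own K/V (40 KV heads).
mha = kv_cache_bytes(n_layers=40, n_kv_heads=40, head_dim=128, seq_len=seq_len)

# Hypothetical GQA variant with 8 KV heads.
gqa = kv_cache_bytes(n_layers=40, n_kv_heads=8, head_dim=128, seq_len=seq_len)

print(f"MHA (no GQA):     {mha / 2**30:.2f} GiB per 4k-token sequence")
print(f"GQA (8 KV heads): {gqa / 2**30:.2f} GiB per 4k-token sequence")
```

The roughly 5x larger cache per sequence is what makes the non-GQA models noticeably heavier at inference with long contexts.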
Hope that answers the question.