Model config.json has Mistral params instead of Mixtral, breaking ExLlama quants and maybe affecting others too

#3
by TheBloke - opened

I got reports that ExLlamav2 wasn't working with this GPTQ. It turns out that's because it's trying to load it as a Mistral model, due to the architecture in config.json being set to Mistral instead of Mixtral.

Also, rope_theta should be 1000000.0 for Mixtral; this can affect inference quality.
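For reference, a minimal sketch of the config.json patch, assuming the standard transformers field names (`architectures`, `model_type`, `rope_theta`) and a hypothetical local model directory:

```python
import json
from pathlib import Path

# Hypothetical path to the quantized model directory.
config_path = Path("Mixtral-8x7B-GPTQ/config.json")

config = json.loads(config_path.read_text())

# Point loaders at the Mixtral architecture instead of Mistral.
config["architectures"] = ["MixtralForCausalLM"]
config["model_type"] = "mixtral"

# Mixtral uses a larger RoPE base frequency than Mistral's default.
config["rope_theta"] = 1000000.0

config_path.write_text(json.dumps(config, indent=2) + "\n")
```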

I don't think any of this would stop k-quants from working, though, so that issue might be unrelated. I'll try making some anyway.

Undi95 changed pull request status to merged