Unable to load with 128K context buffer.

#2
by TheBrinkster - opened

When I load the Q8 version of this model into LM Studio, max context = 8192 tokens, not 128K. How can we achieve the stated 128K context?

In LM Studio, you can override the 8192 default with 131072 to get the 128K context. That's not ideal, though; it would be better to requantize the model with the correct metadata and re-upload it. I'll put it on the list to do.
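For anyone running the GGUF directly with llama.cpp (which LM Studio builds on), the context window can also be overridden at load time with the `-c`/`--ctx-size` flag, regardless of what the file's metadata says. A minimal sketch (the model filename is a placeholder, adjust it to your local file):

```shell
# Load the quant with a 128K context window instead of the
# 8192 recorded in the GGUF metadata.
llama-cli -m ./model-Q8_0.gguf -c 131072 -p "Hello"
```

Note that actually filling a 131072-token context requires a correspondingly large KV cache, so memory use will be much higher than at 8192.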
