Unable to load with 128K context buffer.
#2 opened by TheBrinkster
When I load the Q8 version of this model into LM Studio, the maximum context is 8192 tokens, not 128K. How can we achieve the stated 128K context?
In LM Studio, you can override the 8192 with 131072 for 128K. That's not ideal, though; it's better to requant it and reupload. I'll put it on the list to do.
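For context: the default that LM Studio reports presumably comes from the `llama.context_length` field baked into the GGUF metadata when the file was converted, which is why a requant/reconvert (or a metadata override) is needed. Below is a minimal, stdlib-only sketch for inspecting that field in a local file; it follows the GGUF key/value header layout as I understand it, and `model.gguf` is a placeholder path, not a file from this repo:

```python
import struct

# GGUF scalar value types -> (struct format char, byte size), little-endian
GGUF_SCALARS = {
    0: ("B", 1), 1: ("b", 1), 2: ("H", 2), 3: ("h", 2),
    4: ("I", 4), 5: ("i", 4), 6: ("f", 4), 7: ("?", 1),
    10: ("Q", 8), 11: ("q", 8), 12: ("d", 8),
}

def _read_string(f):
    # GGUF strings: uint64 length prefix, then UTF-8 bytes
    (n,) = struct.unpack("<Q", f.read(8))
    return f.read(n).decode("utf-8")

def _read_value(f, vtype):
    if vtype in GGUF_SCALARS:
        fmt, size = GGUF_SCALARS[vtype]
        return struct.unpack("<" + fmt, f.read(size))[0]
    if vtype == 8:  # string
        return _read_string(f)
    if vtype == 9:  # array: element type, count, then elements
        (etype,) = struct.unpack("<I", f.read(4))
        (count,) = struct.unpack("<Q", f.read(8))
        return [_read_value(f, etype) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {vtype}")

def read_gguf_metadata(path):
    """Return the key/value metadata block of a GGUF file as a dict."""
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        version, n_tensors, n_kv = struct.unpack("<IQQ", f.read(20))
        meta = {}
        for _ in range(n_kv):
            key = _read_string(f)
            (vtype,) = struct.unpack("<I", f.read(4))
            meta[key] = _read_value(f, vtype)
        return meta
```

Usage would look like `read_gguf_metadata("model.gguf").get("llama.context_length")`; if that prints 8192, the limit is baked into the file rather than imposed by LM Studio.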