Unable to load with 128K context buffer.

#2
by TheBrinkster - opened

When I load the Q8 version of this model into LM Studio, max context = 8192 tokens, not 128K. How can we achieve the stated 128K context?

In LM Studio, you can override the 8192 default with 131072 to get the 128K context. That's not ideal, though; it would be better to requantize the model with the correct metadata and re-upload it. I'll put it on the list to do.
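For anyone running the GGUF directly with llama.cpp (which LM Studio builds on), the context window can also be overridden at load time with the `-c`/`--ctx-size` flag, regardless of what the file's metadata says. A minimal sketch (the model filename is a placeholder, adjust it to your local file):

```shell
# Load the quant with a 128K context window instead of the
# 8192 recorded in the GGUF metadata.
llama-cli -m ./model-Q8_0.gguf -c 131072 -p "Hello"
```

Note that actually filling a 131072-token context requires a correspondingly large KV cache, so memory use will be much higher than at 8192.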
