Context length: is it 128K (as mentioned in the model card) or 160K (as specified in config.json)?
#17 · opened by Lissanro
I noticed that https://huggingface.co/deepseek-ai/DeepSeek-V3.1/blob/main/config.json mentions a 160K context length rather than 128K. Not sure if this is a mistake, or if the 128K limit mentioned in the model card applies to input tokens, with an additional 32K on top reserved for output?
R1 0528 had 160K context as well, so I am curious whether it was really reduced for V3.1 or is still the same.
If you look at tokenizer_config.json, it actually says model_max_length is 128K.
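For anyone who wants to compare the two values directly, here is a minimal sketch that reads both files from the repo with huggingface_hub; it assumes the standard field names these files use (max_position_embeddings in config.json, model_max_length in tokenizer_config.json):

```python
import json
from huggingface_hub import hf_hub_download

repo = "deepseek-ai/DeepSeek-V3.1"

# config.json: the model's positional limit
# (160K = 163840 tokens, per the value discussed above)
with open(hf_hub_download(repo, "config.json")) as f:
    config = json.load(f)
print("max_position_embeddings:", config.get("max_position_embeddings"))

# tokenizer_config.json: the tokenizer's declared limit
# (128K = 131072 tokens, per the value discussed above)
with open(hf_hub_download(repo, "tokenizer_config.json")) as f:
    tok_config = json.load(f)
print("model_max_length:", tok_config.get("model_max_length"))
```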