Context length: is it 128K (as mentioned in the model card) or 160K (as specified in config.json)?

#17
by Lissanro

I noticed https://huggingface.co/deepseek-ai/DeepSeek-V3.1/blob/main/config.json mentions a 160K context length rather than 128K. I am not sure if this is a mistake, or if the 128K limit mentioned in the model card applies to input tokens, with an additional 32K on top reserved for output.

R1 0528 had a 160K context as well, so I am curious whether it was really reduced for V3.1 or whether it is still the same.
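
For reference, the figure in config.json can be read straight from the repo (a minimal sketch using huggingface_hub; 160K here corresponds to `max_position_embeddings = 163840`, the value the file reported at the time of this discussion):

```python
import json
from huggingface_hub import hf_hub_download

# Download config.json from the model repo
path = hf_hub_download("deepseek-ai/DeepSeek-V3.1", "config.json")
with open(path) as f:
    config = json.load(f)

# max_position_embeddings is the positional limit declared for the model
print(config["max_position_embeddings"])  # 163840, i.e. 160K
```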

If you look at tokenizer_config.json, it actually says model_max_length is 128K.
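
The same check on the tokenizer side (a minimal sketch; 128K corresponds to `model_max_length = 131072`, assuming the file still carries that value):

```python
import json
from huggingface_hub import hf_hub_download

# tokenizer_config.json carries the limit the tokenizer reports by default
path = hf_hub_download("deepseek-ai/DeepSeek-V3.1", "tokenizer_config.json")
with open(path) as f:
    tok_config = json.load(f)

print(tok_config["model_max_length"])  # 131072, i.e. 128K
```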
