Text Generation
Transformers
Safetensors
English
olmo2
conversational

32b, 4k ctx?

#2
by lucyknada - opened

Is 4k the final context length planned for this model, or is there more in the works?
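For anyone wanting to check a checkpoint's trained context window themselves, it can be read from the model's `config.json`. A minimal sketch, assuming the standard Transformers `max_position_embeddings` field (the config fragment below is illustrative, not copied from the actual OLMo-2 repo):

```python
# Sketch: read the trained context window from a Transformers-style
# config dict. `max_position_embeddings` is the usual field name;
# the example values are illustrative.
def context_window(config: dict) -> int:
    """Return the model's maximum context length in tokens."""
    return config["max_position_embeddings"]

example_config = {"model_type": "olmo2", "max_position_embeddings": 4096}
print(context_window(example_config))  # → 4096, the 4k under discussion
```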

I really like what they did with the whole "fully open source" deal, but the 4k context length is indeed head-scratching. I'd also like to hear a word on this.

(attachment: image.png)

Self-replying, since this was apparently mentioned on Twitter but not here; what "very soon" means is another question.

I can’t serve this model with a context length limited to 4K. A 4K context might be acceptable for smaller models (0.5B or 1B) intended for on-device use cases, but for a 32B model, I need it to support at least a 128K context window to achieve decent performance at 32K.

Anyone else getting this?

```
$ ollama run MHKetbi/allenai_OLMo2-0325-32B-Instruct:Q8_0
Error: llama runner process has terminated: error loading model: check_tensor_dims: tensor 'blk.0.attn_k_norm.weight' has wrong shape; expected 5120, got 1024, 1, 1, 1
```

Hi @lucyknada , we’re working on making the context longer. We’re definitely planning to do that in the next versions. Stay tuned! Thanks everyone else for the feedback.

Hey @TheSeminal , there is an ongoing issue with llama.cpp. Check this out for more context: https://huggingface.co/allenai/OLMo-2-0325-32B-Instruct-GGUF/discussions/1
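For what it's worth, the numbers in the error message line up with a grouped-query-attention layout, where the k-norm weight covers only the key/value heads rather than the full hidden size. A rough sketch of the arithmetic, assuming OLMo-2 32B uses hidden_size 5120 with 40 query heads and 8 KV heads (my assumption, not verified against the checkpoint):

```python
# Why a loader expecting a full-hidden-size k_norm would complain:
# with grouped-query attention, keys only span the KV heads.
# Config numbers below are assumed, not read from the model files.
hidden_size = 5120
num_attention_heads = 40
num_key_value_heads = 8

head_dim = hidden_size // num_attention_heads   # 128
k_norm_dim = num_key_value_heads * head_dim     # what the file stores

print(k_norm_dim)   # 1024 -> the "got 1024" in the error
print(hidden_size)  # 5120 -> the "expected 5120"
```

If that reading is right, the fix belongs in the loader's expected-shape check, which is what the linked llama.cpp discussion tracks.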

That's fantastic to hear. Any chance there is a rough timeline for when that might happen? Thank you again for all you do!

Hey @mruderman , not sure when we're going to release it. But we're grinding on it.
