
Branches step1200000-tokens5033B and fp32 do not exist

#1
by deltanym - opened

As you say in the model card, step1200000-tokens5033B should be the pre-annealing base model branch; however, it is not present in this repo. The fp32 branch is also missing.
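For anyone who wants to reproduce the check, here is a minimal sketch of one way to compare the branches named in a model card against the branches a Hub repo actually has. It assumes the `huggingface_hub` package is installed and uses its `list_repo_refs` function; the helper names `missing_branches` and `list_branch_names` are hypothetical, and the repo id is passed on the command line rather than assumed.

```python
import sys
from typing import Iterable, List


def missing_branches(expected: Iterable[str], available: Iterable[str]) -> List[str]:
    """Return the expected branch names that are absent from `available`, in order."""
    available_set = set(available)
    return [name for name in expected if name not in available_set]


def list_branch_names(repo_id: str) -> List[str]:
    """Fetch the branch names of a Hub repo (needs network access)."""
    # Imported lazily so the pure helper above works without the package.
    from huggingface_hub import list_repo_refs

    refs = list_repo_refs(repo_id)
    return [branch.name for branch in refs.branches]


if __name__ == "__main__":
    # Usage: python check_branches.py <repo_id>
    # The repo id is supplied by the caller; it is not hard-coded here.
    if len(sys.argv) != 2:
        sys.exit("usage: python check_branches.py <repo_id>")

    # Branch names taken from the model card, as quoted in this thread.
    expected = ["step1200000-tokens5033B", "fp32"]
    absent = missing_branches(expected, list_branch_names(sys.argv[1]))
    print("missing branches:", absent if absent else "none")
```

If the script reports both names as missing, that matches what is described above: the model card references branches that the repo does not contain.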

Yes, I think the branch information was copied from the older model card and does not fully apply here. Maybe @soldni knows what needs to be changed in this model card?

I see. It's also not really documented how -0125 differs from -0924, at least not here or anywhere else I could find. Could you explain that, or is it documented somewhere I missed?

The paper has some documentation on it in the Appendix: https://arxiv.org/abs/2409.02060

Oh, thanks! I hadn't found the updated version.
So the pre-training part is the same, and the model is just annealed / mid-trained on different data?

Ai2 org

Hi, thanks again for the inquiry! We’re currently working on closing out old tickets, so we’re closing this out for now, but if you require a follow-up response, please re-open and we will get back to you!

baileyk changed discussion status to closed
