Model does not load
#1 by ekuznets - opened
The model does not load with the current version of transformers.
The problem is numerous weight-name mismatches. For example, transformers expects language_model.model.layers.0.feed_forward.experts.gate_up_proj, but the checkpoint actually contains language_model.model.layers.0.feed_forward.experts.gate_up_proj.weight.
I'm not sure how this could have happened. I see that gate_up_proj is an nn.Parameter in llama4: https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama4/modeling_llama4.py#L55 (In several other models, the module with this name is an nn.Linear, and in those the weight names would end with "gate_up_proj.weight".)
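
In case it helps anyone check their own copy, here is a rough sketch of how the mismatch could be detected and the keys renamed. It works on plain dicts, so it does not depend on transformers or torch. The helper names (find_suffix_mismatches, rename_keys) and the toy key lists are made up for illustration and are not part of any library. It only renames keys: it assumes the stored tensors already have the shape and layout that the nn.Parameter expects, which I have not verified for this checkpoint.

```python
def find_suffix_mismatches(expected_keys, checkpoint_keys, suffix=".weight"):
    """Map checkpoint keys to expected keys when they differ only by `suffix`.

    Hypothetical helper: a key is reported only if the expected name is absent
    from the checkpoint while the same name plus `suffix` is present.
    """
    checkpoint = set(checkpoint_keys)
    renames = {}
    for key in expected_keys:
        if key not in checkpoint and key + suffix in checkpoint:
            renames[key + suffix] = key
    return renames


def rename_keys(state_dict, renames):
    """Return a new dict with keys renamed; values are passed through untouched."""
    return {renames.get(key, key): value for key, value in state_dict.items()}


# Toy data: the first name is the one from this thread, the second is an
# assumed nn.Linear weight that already matches and should be left alone.
prefix = "language_model.model.layers.0."
expected = [
    prefix + "feed_forward.experts.gate_up_proj",
    prefix + "self_attn.q_proj.weight",
]
checkpoint = {
    prefix + "feed_forward.experts.gate_up_proj.weight": "tensor A",
    prefix + "self_attn.q_proj.weight": "tensor B",
}

renames = find_suffix_mismatches(expected, checkpoint.keys())
fixed = rename_keys(checkpoint, renames)

for old, new in renames.items():
    print(f"{old} -> {new}")
```

With a real checkpoint, the expected names would come from the model's state_dict().keys() and the stored names from the checkpoint files. If the shapes differ as well, renaming alone will not be enough.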