"model_type" and "architectures" in config.json

#2
by KOKKKOKK - opened

Could you please explain what P6DenseForCausalLM and seed_p6dense are?
I cannot use the model for inference via transformers.
Thanks.

Traceback (most recent call last):
  File "/liuchonghan/liuche/tmp.py", line 6, in <module>
    model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
  File "/opt/conda/envs/liuche/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 547, in from_pretrained
    config, kwargs = AutoConfig.from_pretrained(
  File "/opt/conda/envs/liuche/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1190, in from_pretrained
    raise ValueError(
ValueError: The checkpoint you are trying to load has model type seed_p6dense but Transformers does not recognize this architecture.

BytedTsinghua-SIA org

Thank you for raising this issue, and apologies for the confusion! We have updated the model so that it is compatible with the standard transformers implementation. P6Dense is the codename for our internal architecture implementation. You can now load the model like any other model on Hugging Face. Let us know if there are any other issues!
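For context, transformers resolves which model class to instantiate from the model_type and architectures fields in config.json, so the ValueError above means seed_p6dense is not a registered model type in the installed transformers version. A config.json for a natively supported architecture looks roughly like this (illustrative values only, not the actual updated config of this model):

```json
{
  "model_type": "llama",
  "architectures": ["LlamaForCausalLM"],
  "hidden_size": 4096,
  "num_attention_heads": 32
}
```

Once model_type names a registered architecture (or the repo ships an auto_map pointing to custom code), AutoModelForCausalLM.from_pretrained can resolve and load the model.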
