Missing MultiModalLLM_PT Source Code for InternVideo2_chat_8B_HD_F16
#1 — opened by CosineOne
I'm trying to load the OpenGVLab/InternVideo2_chat_8B_HD_F16 model (pinned revision 0568fba18da65fc0c320e6a4be361ea40ce62a68) locally using AutoModel.from_pretrained(..., trust_remote_code=True).
The config.json for this revision specifies "architectures": ["MultiModalLLM_PT"] and "model_type": "mistral". However:
- The pinned revision 0568fba... contains no Python (.py) files.
- Its config.json lacks an auto_map entry to point to the MultiModalLLM_PT class definition.
- The main branch (which transformers would fall back to) also does not appear to contain a MultiModalLLM_PT definition (e.g., in modeling_mistral.py, or in other files such as modeling_internvideo2_vit.py or modeling_base.py).
This makes it impossible to load the model with trust_remote_code=True, since the MultiModalLLM_PT class definition cannot be resolved.
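For context, transformers resolves custom classes under trust_remote_code=True via the config's auto_map, which maps an Auto* class to "module_file.ClassName" in the repo. A minimal sketch of the failing check (the config contents mirror this revision; the auto_map shown in the comment is hypothetical, as the actual module/class location is exactly what's missing):

```python
import json

# Config contents mirroring the pinned revision: "architectures" names the
# class, but there is no "auto_map" telling transformers which .py file
# in the repo defines it.
config = json.loads("""
{
  "architectures": ["MultiModalLLM_PT"],
  "model_type": "mistral"
}
""")

# A working repo would need an entry like (file/class names hypothetical):
#   "auto_map": {"AutoModel": "modeling_videochat2.MultiModalLLM_PT"}
resolvable = "auto_map" in config and "AutoModel" in config.get("auto_map", {})
print("remote class resolvable:", resolvable)
```

Running this against the pinned revision's config.json prints `remote class resolvable: False`, which matches the failure mode above: even with the model weights present, there is no way for transformers to locate the class.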
Could you please provide guidance on where to find the source code for MultiModalLLM_PT for this model version, or if there are plans to add it to the repository?
Thank you!