Expanding inputs for image tokens in LLaVa-NeXT should be done in processing.
I was using the example code in the model card for image understanding, but I get the following messages, which I am not sure I need to worry about. If so, what should I do? Thanks a lot!
Expanding inputs for image tokens in LLaVa-NeXT should be done in processing.
Please add `patch_size` and `vision_feature_select_strategy` to the model's processing config or set directly with `processor.patch_size = {{patch_size}}` and `processor.vision_feature_select_strategy = {{vision_feature_select_strategy}}`.
Using processors without these attributes in the config is deprecated and will throw an error in v4.47.
The same message appears when I use LLaVA 1.5 models, which is strange because I stick strictly to the code provided in the model cards.
I solved this problem by adding two lines during llava-1.5-7b-hf initialization:
self.processor.patch_size = self.model.config.vision_config.patch_size
self.processor.vision_feature_select_strategy = self.model.config.vision_feature_select_strategy
The code above sets `patch_size` and `vision_feature_select_strategy` on the processor manually, using the same values from `model.config`.
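For anyone who wants to try the fix in isolation, here is a minimal sketch. It assumes the `llava-hf/llava-1.5-7b-hf` checkpoint and loads only the config and processor (not the model weights), since the values needed for the fix live in the config:

```python
from transformers import AutoConfig, AutoProcessor

model_id = "llava-hf/llava-1.5-7b-hf"

# Loading the config avoids downloading the full model weights; the same
# values are available as model.config if the model is already loaded.
config = AutoConfig.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)

# Copy the values the processor needs from the model config so that input
# expansion for image tokens can happen in processing, as the warning asks.
processor.patch_size = config.vision_config.patch_size
processor.vision_feature_select_strategy = config.vision_feature_select_strategy
```

The same pattern should also apply to the LLaVA-NeXT checkpoints that emit this warning.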
Hey everyone, the official model config will be updated with the new params soon, most prob in 2-3 weeks. That should eliminate any recent bugs with latency/indexing etc.
@RaushanTurganbay Hi! Could you confirm whether the quick fix mentioned above is sufficient, or could you help update the model config soon? We're on a tight deadline and want to prevent any potential issues, so we'd really appreciate your help with updating it.