Why did you cut off the vision encoder of the model?
#1
by
PavelSaranskiy
- opened
Why did you cut the vision encoder out of the model in this version? Or did you do something else to disable the LLM's vision capabilities? Is it only to use less storage? I run models on a secondary GPU, so pre-fill and vision encoding consume compute and VRAM on my cheap, old main GPU rather than on the large secondary GPU. Maybe that is why I see only a 0.3 GB difference in VRAM usage on the large GPU between your version and bartowski's old version.
I didn't include the vision encoder because I was mainly focused on testing text performance. You can use the same mmproj as bartowski's or Google's version; they are compatible (I just tested it). I'll upload it soon if it helps.
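For anyone landing here later, a rough sketch of what pairing the text-only weights with a separate mmproj could look like. The thread does not say which runtime is in use, so this assumes llama.cpp's `llama-server` and its `-m`, `--mmproj`, and `-ngl` options. Both file names and the `build_llama_cmd` helper are hypothetical placeholders, not files from any of the repos mentioned.

```python
import shlex


def build_llama_cmd(model_path, mmproj_path, binary="llama-server", gpu_layers=99):
    """Build an argv list that loads a text-only GGUF plus a separate mmproj file.

    Assumes llama.cpp's llama-server; adjust `binary` if your setup differs.
    """
    return [
        binary,
        "-m", model_path,           # text-only model weights
        "--mmproj", mmproj_path,    # vision projector taken from another upload
        "-ngl", str(gpu_layers),    # number of layers to offload to the GPU
    ]


# Placeholder file names: substitute the files you actually downloaded.
cmd = build_llama_cmd("model-text-only.gguf", "mmproj-from-other-repo.gguf")
print(shlex.join(cmd))
```

The printed command line can be pasted into a shell, or `cmd` can be passed directly to `subprocess.run`.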
stduhpf
changed discussion status to
closed
Thanks. I see now (and models can see now xD).