Why did you cut off the vision encoder of the model?

#1
by PavelSaranskiy - opened

Why did you cut off the vision encoder of the model in this version? Or did you do something else to disable the LLM's vision capabilities? Was it only to reduce storage use? I run models on a secondary GPU, so pre-fill and vision encoding consume compute and VRAM not on the large secondary GPU but on my cheap, old main GPU. Maybe that is why I see only a 0.3 GB difference in the large GPU's VRAM usage between your version and bartowski's old version.

I didn't include the vision encoder because I was mainly focused on testing the text performance. You can use the same mmproj as bartowski's or Google's version; they are compatible (I just tested it). I'll upload it soon if it helps.

stduhpf changed discussion status to closed

Thanks. I see now (and the models can see now too xD).
