Why did you cut off the vision encoder of the model?

by PavelSaranskiy - opened 9 days ago

9 days ago

Why did you cut the vision encoder out of the model in this version? Or did you do something else to disable the LLM's vision capabilities? Was it only to save storage? I run models on a secondary GPU, so prefill and vision encoding consume compute and VRAM on my cheap, old main GPU rather than on the large secondary GPU. Maybe that is why I see only a 0.3 GB difference in VRAM usage on the large GPU between your version and bartowski's older version.

stduhpf

Owner 9 days ago

I didn't include the vision encoder because I was mainly focused on testing the text performance. You can use the same mmproj as bartowski's or Google's version; they are compatible (I just tested it). I'll upload it soon if it helps.
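
For anyone unsure how to pair the two files, here is a minimal sketch, assuming you run the model with llama.cpp's `llama-server` and pass the projector through its `--mmproj` option. The helper `build_server_command` and both file names are hypothetical placeholders, not the actual file names in either repo.

```python
import shlex


def build_server_command(model_path, mmproj_path, binary="llama-server"):
    """Assemble a llama.cpp launch command that loads a text-only GGUF
    together with a separately downloaded mmproj (vision projector) file."""
    return [binary, "-m", model_path, "--mmproj", mmproj_path]


# Placeholder file names (assumptions): substitute the files you downloaded.
cmd = build_server_command("model-text-only.gguf", "mmproj-from-other-repo.gguf")

# Print a copy-pasteable shell command line.
print(shlex.join(cmd))
```

The command is built as a list so it can also be passed directly to `subprocess.run` without shell quoting issues.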

stduhpf changed discussion status to closed 9 days ago

PavelSaranskiy

8 days ago

Thanks. I see now (and models can see now xD).
