Model Card for Qwen2.5-VL-7B-ViT

This is an unofficial extraction of the NaViT vision encoder from Qwen2.5-VL-7B.
It is used by MGM-Omni.

Safetensors

Model size: 677M params
Tensor type: BF16
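As a rough sanity check on the listed parameter count and tensor type, the weight footprint can be estimated with simple arithmetic. This is a minimal sketch, assuming 2 bytes per BF16 parameter and treating the rounded "677M" as 677,000,000 parameters; it ignores activations and any buffers stored in another dtype. The function name `estimate_weight_bytes` is hypothetical and not part of any library.

```python
# Bytes per element for a few common tensor dtypes.
BYTES_PER_PARAM = {"BF16": 2, "FP16": 2, "FP32": 4}


def estimate_weight_bytes(num_params: int, dtype: str = "BF16") -> int:
    """Return the approximate size of the weights in bytes."""
    return num_params * BYTES_PER_PARAM[dtype]


# 677M is the rounded count shown on the model page, so the result is approximate.
approx_params = 677_000_000
size_bytes = estimate_weight_bytes(approx_params)
print(f"{size_bytes / 1e9:.2f} GB")  # prints "1.35 GB"
```

Under these assumptions the encoder weights alone take about 1.35 GB, before accounting for activations at inference time.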

Model tree for wcy1122/Qwen2.5-VL-7B-ViT

Base model: Qwen/Qwen2.5-VL-7B-Instruct
Relation: this model is listed as one of the 745 fine-tunes of the base model.

Collection including wcy1122/Qwen2.5-VL-7B-ViT

MGM-Omni: Scaling Omni LLMs to Personalized Long-Horizon Speech (18 items)