UniME is a series of multimodal large language models trained to learn universal multimodal embeddings.
- DeepGlint-AI/UniME-Phi3.5-V-4.2B
  Image-Text-to-Text • 440 • 7
- DeepGlint-AI/UniME-LLaVA-1.6-7B
  Image-Text-to-Text • 72 • 5
- Breaking the Modality Barrier: Universal Embedding Learning with Multimodal LLMs
  Paper • 2504.17432 • 39
- DeepGlint-AI/UniME-LLaVA-OneVision-7B
  Image-Text-to-Text • 726 • 2