MGM-Omni Collection An open-source Omni Chatbot for Long Audio and Voice Clone • 12 items • Updated 7 days ago • 6
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning Paper • 2507.13348 • Published Jul 17 • 72
Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition Paper • 2412.09501 • Published Dec 12, 2024 • 49
VisionZip: Longer is Better but Not Necessary in Vision Language Models Paper • 2412.04467 • Published Dec 5, 2024 • 118
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 683
MGM-Data Collection Official data collection for the paper "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models" • 2 items • Updated Apr 21, 2024 • 7
MGM Collection Official model collection for the paper "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models" • 13 items • Updated May 3, 2024 • 47
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models Paper • 2403.18814 • Published Mar 27, 2024 • 48