Running on A100 • 226 • Omnilingual ASR Media Transcription • Transcribe audio or video into text in any language
baidu/ERNIE-4.5-VL-28B-A3B-Thinking Image-Text-to-Text • 30B • Updated 17 days ago • 533 • 515
Qwen/Qwen2.5-VL-7B-Instruct Image-Text-to-Text • 8B • Updated Apr 6, 2025 • 2.25M • 1.42k
Running on Zero • MCP • 387 • Multimodal OCR • nanonets ocr2 / olmocr / qwen2vl ocr / aya vision / rolmocr
Qwen/Qwen2.5-VL-72B-Instruct Image-Text-to-Text • 73B • Updated Jun 6, 2025 • 72.6k • 580
Running • Featured • 365 • Qwen2.5 Omni 7B Demo • Generate text and speech responses from text, audio, images, or video input
facebook/dinov3-vit7b16-pretrain-lvd1689m Image Feature Extraction • 7B • Updated Aug 19, 2025 • 12.1k • 201
nvidia/Llama-3.1-Nemotron-Nano-4B-v1.1 Text Generation • 5B • Updated Oct 15, 2025 • 1.44k • 111
nvidia/Llama-3_3-Nemotron-Super-49B-v1_5 Text Generation • 50B • Updated Oct 15, 2025 • 20.3k • 221
Qwen/Qwen3-Coder-480B-A35B-Instruct Text Generation • 480B • Updated Aug 21, 2025 • 18.8k • 1.27k