Article KV Cache from scratch in nanoVLM By ariG23498 and 4 others • 25 days ago • 79
Article Vision Language Models (Better, Faster, Stronger) By merve and 4 others • May 12 • 465
Emerging Properties in Unified Multimodal Pretraining Paper • 2505.14683 • Published May 20 • 130
Article nanoVLM: The simplest repository to train your VLM in pure PyTorch By ariG23498 and 6 others • May 21 • 174
Qwen2.5-Omni Collection End-to-End Omni (text, audio, image, video, and natural speech interaction) model based on Qwen2.5 • 7 items • Updated May 21 • 145