DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 8 days ago • 270
BlueLM-V-3B: Algorithm and System Co-Design for Multimodal Large Language Models on Mobile Devices Paper • 2411.10640 • Published Nov 16, 2024 • 45
PaliGemma 2: A Family of Versatile VLMs for Transfer Paper • 2412.03555 • Published Dec 4, 2024 • 124
NVILA: Efficient Frontier Visual Language Models Paper • 2412.04468 • Published Dec 5, 2024 • 57
VisionZip: Longer is Better but Not Necessary in Vision Language Models Paper • 2412.04467 • Published Dec 5, 2024 • 105
Teach Multimodal LLMs to Comprehend Electrocardiographic Images Paper • 2410.19008 • Published Oct 21, 2024 • 23
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss Paper • 2410.17243 • Published Oct 22, 2024 • 89
Article PaliGemma – Google's Cutting-Edge Open Vision Language Model May 14, 2024 • 234
Article Training Stable Diffusion with Dreambooth using 🧨 Diffusers Nov 7, 2022 • 16
RadRotator: 3D Rotation of Radiographs with Diffusion Models Paper • 2404.13000 • Published Apr 19, 2024 • 25
Understanding LLMs: A Comprehensive Overview from Training to Inference Paper • 2401.02038 • Published Jan 4, 2024 • 63
PICTURE: PhotorealistIC virtual Try-on from UnconstRained dEsigns Paper • 2312.04534 • Published Dec 7, 2023 • 6