PhyCritic: Multimodal Critic Models for Physical AI Paper • 2602.11124 • Published 1 day ago • 45
RADIO Collection A collection of Foundation Vision Models that combine multiple models (CLIP, DINOv2, SAM, etc.). • 19 items • Updated 1 day ago • 32
google/siglip2-giant-opt-patch16-384 Zero-Shot Image Classification • 2B • Updated Feb 21, 2025 • 138k • 35
MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning Paper • 2601.21468 • Published 15 days ago • 20
llm-semantic-router/multi-modal-embed-small Sentence Similarity • Updated 7 days ago • 108 • 12
DocReward: A Document Reward Model for Structuring and Stylizing Paper • 2510.11391 • Published Oct 13, 2025 • 27
OpenMed/Medical-Reasoning-SFT-Nemotron-Nano-30B Viewer • Updated 9 days ago • 445k • 694 • 33
TwinBrainVLA: Unleashing the Potential of Generalist VLMs for Embodied Tasks via Asymmetric Mixture-of-Transformers Paper • 2601.14133 • Published 23 days ago • 60
nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-NVFP4 Text Generation • 18B • Updated 3 days ago • 147k • 92