ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs Paper • 2506.10128 • Published Jun 11 • 23
Cosmos-Reason1 Collection Multimodal world understanding through reasoning • 10 items • Updated 6 days ago • 35
Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models Paper • 2504.15271 • Published Apr 21 • 66
PixMo Collection A set of vision-language datasets built by Ai2 and used to train the Molmo family of models. Read more at https://molmo.allenai.org/blog • 10 items • Updated Apr 30 • 77
Eagle 2 Collection Eagle 2 is a family of frontier vision-language models with a vision-centric design. The models support 4K HD input, long-context video, and grounding. • 10 items • Updated 6 days ago • 36
LLaVA-Critic Collection A general evaluator for assessing model performance • 6 items • Updated Oct 6, 2024 • 10