Dataset for evaluating the visual perception capabilities of LVLMs.
- VisOnlyQA: Large Vision Language Models Still Struggle with Visual Perception of Geometric Information (paper, arXiv:2412.00947)
- ryokamoi/VisOnlyQA_Eval_Real_v1.1 (dataset, 900 rows)
- ryokamoi/VisOnlyQA_Eval_Synthetic (dataset, 700 rows)
- ryokamoi/VisOnlyQA_Train (dataset, 70k rows)