JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation Paper • 2410.17250 • Published Oct 22, 2024 • 14
evborjnvioerjnvuowsetngboetgjbeigjaweuofjf/i-love-anime-sakuga Viewer • Updated 11 days ago • 1.49M • 144 • 18
What matters when building vision-language models? Paper • 2405.02246 • Published May 3, 2024 • 102