takara-ai's Collections

VLM Performance

updated Jul 10, 2024

  • Vision language models are blind

    Paper • 2407.06581 • Published Jul 9, 2024 • 83

    Note: Use the BlindTest Eval benchmark to evaluate models on vision tasks that are easy for humans.
