ceyda's Collections

VQA (Image captioning,QA)

updated Dec 26, 2024

  • Running
    35

    FuseCap

    📊

    Generate captions for images


  • Runtime error
    424

    Kosmos 2

    💻

    Generate a detailed image caption with highlighted entities


  • Running
    7

    Vilt Nlvr

    🚀

    Compare two images with a sentence


  • Build error
    125

    Qwen VL

    ⚡


  • Running on T4
    418

    LLaVA

    🔥

    Chat with an AI assistant using text and images


  • Runtime error
    309

    Fuyu Multimodal

    👁


  • Runtime error
    156

    MoE LLaVA

    🚀


  • Runtime error
    167

    IDEFICS2 Playground

    🐨


  • Running on Zero
    82

    CuMo 7b Zero

    🐐

    Generate text based on images and text input


  • What matters when building vision-language models?

    Paper • 2405.02246 • Published May 3, 2024 • 104

  • Running
    431

    moondream2

    🌔

    A tiny vision-language model


  • Running
    102

    Idefics3

    📊

    Generate text based on an image and prompt
