Hugging Face
Models: 623
Sort: Trending
Active filters: visual-question-answering
microsoft/Phi-4-multimodal-instruct • Automatic Speech Recognition • Updated May 1 • 467k • 1.41k
DAMO-NLP-SG/VideoLLaMA3-7B • Visual Question Answering • Updated Mar 20 • 91.4k • 59
Salesforce/blip2-opt-2.7b • Image-Text-to-Text • Updated Feb 3 • 913k • 378
openbmb/MiniCPM-V-2 • Visual Question Answering • Updated Jan 15 • 6.07k • 475
microsoft/OmniParser • Image-Text-to-Text • Updated Dec 2, 2024 • 611 • 1.67k
google/cxr-foundation • Image Classification • Updated Feb 20 • 89 • 75
remyxai/SpaceThinker-Qwen2.5VL-3B • Image-Text-to-Text • Updated about 22 hours ago • 1.95k • 14
bharatgenai/patram-7b-instruct • Image-Text-to-Text • Updated 1 day ago • 12 • 2
Salesforce/blip-vqa-base • Visual Question Answering • Updated Feb 3 • 701k • 159
unum-cloud/uform-gen2-qwen-500m • Image-to-Text • Updated Apr 24, 2024 • 4.7k • 77
microsoft/Phi-4-multimodal-instruct-onnx • Automatic Speech Recognition • Updated Mar 3 • 114 • 70
ritzzai/GUI-R1 • Visual Question Answering • Updated Apr 21 • 4
dandelin/vilt-b32-finetuned-vqa • Visual Question Answering • Updated Aug 2, 2022 • 128k • 413
azwierzc/vilt-b32-finetuned-vqa-pl • Visual Question Answering • Updated Mar 21, 2022 • 6
Bingsu/temp_vilt_vqa • Visual Question Answering • Updated Nov 28, 2022 • 8
microsoft/git-base-vqav2 • Visual Question Answering • Updated Mar 9, 2024 • 205 • 19
microsoft/git-base-textvqa • Visual Question Answering • Updated Mar 29, 2024 • 1.65k • 6
Salesforce/blip-vqa-capfilt-large • Visual Question Answering • Updated Feb 3 • 210k • 52
tufa15nik/vilt-finetuned-vqasi • Visual Question Answering • Updated Dec 15, 2022 • 5
microsoft/git-large-vqav2 • Visual Question Answering • Updated Sep 7, 2023 • 319 • 17
microsoft/git-large-textvqa • Visual Question Answering • Updated Apr 9, 2024 • 69 • 4
ivelin/donut-refexp-combined-v1 • Visual Question Answering • Updated Feb 7, 2023 • 14 • 4
tifa-benchmark/promptcap-coco-vqa • Image-to-Text • Updated Dec 11, 2023 • 211 • 12
Salesforce/blip2-flan-t5-xl • Image-Text-to-Text • Updated Feb 3 • 124k • 76
sheldonxxxx/OFA_model_weights • Visual Question Answering • Updated Feb 8, 2023 • 1
Salesforce/blip2-opt-6.7b • Image-Text-to-Text • Updated Feb 3 • 8.03k • 77
Salesforce/blip2-opt-2.7b-coco • Image-to-Text • Updated Feb 3 • 3.76k • 9
Salesforce/blip2-opt-6.7b-coco • Image-Text-to-Text • Updated Feb 3 • 59.8k • 34
Salesforce/blip2-flan-t5-xl-coco • Image-to-Text • Updated Feb 3 • 1.82k • 15
Salesforce/blip2-flan-t5-xxl • Image-Text-to-Text • Updated Feb 3 • 7.24k • 90