deepakkumar07's Collections

vision-llm

updated Sep 5

  • Running
    109

    Vision Papers

    💻

    All paper summaries read by Merve


  • Runtime error
    20

    Ovis2 1B

    🦫

    Small model can do big things.


  • AIDC-AI/Ovis2-8B-GPTQ-Int4

    Image-Text-to-Text • 3B • Updated Mar 25 • 3.93k • 3

  • AIDC-AI/Ovis2-1B

    Image-Text-to-Text • 1B • Updated Aug 15 • 19.2k • 95

  • Runtime error
    13

    Ovis2 8B

    🦫

    Ovis2-8B


  • lambdalabs/Llama-3.3-70B-Instruct-AWQ-4bit

    11B • Updated Dec 10, 2024 • 517 • 4

  • microsoft/GUI-Actor-7B-Qwen2-VL

    Image-Text-to-Text • 8B • Updated Aug 9 • 224 • 38

  • lambdalabs/sd-image-variations-diffusers

    Image-to-Image • Updated Feb 8, 2023 • 2.87k • 451

  • vikhyatk/moondream2

    Image-Text-to-Text • 2B • Updated 17 days ago • 355k • 1.31k

  • OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview

    Image-Text-to-Text • 0.4B • Updated Aug 29 • 54k • 72