merve's Collections
Jan 24 Releases
Updated 6 days ago
ostris/Flex.1-alpha • Text-to-Image • Updated 11 days ago • 15.4k • 316
Qwen/Qwen2.5-Math-PRM-72B • Text Classification • Updated 13 days ago • 1k • 65
HuggingFaceTB/SmolVLM-500M-Instruct • Image-Text-to-Text • Updated 7 days ago • 7.72k • 80
deepseek-ai/DeepSeek-R1 • Text Generation • Updated 4 days ago • 498k • 5.17k
yale-nlp/MMVU • Viewer • Updated 4 days ago • 1k • 5.16k • 53
cais/hle • Viewer • Updated 7 days ago • 3k • 1.61k • 131
nvidia/AceMath-7B-Instruct • Text Generation • Updated 13 days ago • 764 • 9
tencent/Hunyuan3D-2 • Image-to-3D • Updated 6 days ago • 24.2k • 641
nvidia/AceMath-Instruct-Training-Data • Viewer • Updated 13 days ago • 5.56M • 2.74k • 37
bytedance-research/UI-TARS-72B-DPO • Image-Text-to-Text • Updated 5 days ago • 6.1k • 74
declare-lab/TangoFlux • Text-to-Audio • Updated 8 days ago • 3.02k • 76
Hunyuan3D-2.0 • Text-to-3D and Image-to-3D Generation • Running on Zero • 955
nvidia/AceMath-72B-Instruct • Text Generation • Updated 13 days ago • 74 • 6
vidore/colSmol-256M • Updated 7 days ago • 738 • 5
nvidia/AceMath-72B-RM • Text Generation • Updated 13 days ago • 21 • 6
MiniMaxAI/MiniMax-VL-01 • Image-Text-to-Text • Updated 5 days ago • 2.02k • 224
DAMO-NLP-SG/VideoLLaMA3-2B-Image • Visual Question Answering • Updated 6 days ago • 404 • 6
DAMO-NLP-SG/VideoLLaMA3-2B • Visual Question Answering • Updated 3 days ago • 749 • 5
DAMO-NLP-SG/VideoLLaMA3-7B-Image • Visual Question Answering • Updated 6 days ago • 422 • 8
DAMO-NLP-SG/VideoLLaMA3-7B • Visual Question Answering • Updated 3 days ago • 2.24k • 28
bytedance-research/UI-TARS-72B-SFT • Image-Text-to-Text • Updated 5 days ago • 270 • 10
bytedance-research/UI-TARS-7B-SFT • Image-Text-to-Text • Updated 5 days ago • 2.56k • 124
bytedance-research/UI-TARS-7B-DPO • Image-Text-to-Text • Updated 5 days ago • 15.3k • 98
HuggingFaceTB/SmolVLM-256M-Instruct • Image-Text-to-Text • Updated 7 days ago • 11.7k • 106
HuggingFaceTB/SmolVLM-256M-Base • Image-Text-to-Text • Updated 10 days ago • 1.24k • 8
HuggingFaceTB/SmolVLM-500M-Base • Image-Text-to-Text • Updated 10 days ago • 233 • 7
SmolVLM • Running on Zero • 38
MiniMaxVL01 • Running • 46
vidore/colSmol-500M • Updated 7 days ago • 218 • 8
SmolVLM 256M Instruct WebGPU • Running • 32
SmolVLM 500M Instruct WebGPU • Running • 24
HKUSTAudio/Llasa-3B • Text-to-Speech • Updated 4 days ago • 4.6k • 381
HKUSTAudio/xcodec2 • Audio-to-Audio • Updated 6 days ago • 9.23k • 27
Qwen/Qwen2.5-Math-PRM-7B • Text Classification • Updated 13 days ago • 10.4k • 48
nvidia/AceMath-7B-RM • Text Generation • Updated 13 days ago • 32 • 5
vidore/colSmol-500M-base • Updated 7 days ago • 3 • 1
vidore/colSmol-256M-base • Updated 7 days ago • 2 • 1
TangoFlux • Text to Audio (Sound SFX) Generator • Running on Zero • 275
Flex.1-alpha • Running on Zero • 52