InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency Paper • 2508.18265 • Published Aug 25, 2025 • 211
OpenGVLab/InternVL3_5-241B-A28B Image-Text-to-Text • 241B • Updated Aug 29, 2025 • 1.1k • 133
MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents Paper • 2507.19478 • Published Jul 25, 2025 • 31
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities Paper • 2505.02567 • Published May 5, 2025 • 80
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14, 2025 • 306
Envisioning Beyond the Pixels: Benchmarking Reasoning-Informed Visual Editing Paper • 2504.02826 • Published Apr 3, 2025 • 68
Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy Paper • 2503.19757 • Published Mar 25, 2025 • 51
VisualPRM: An Effective Process Reward Model for Multimodal Reasoning Paper • 2503.10291 • Published Mar 13, 2025 • 36
SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding Paper • 2412.09604 • Published Dec 12, 2024 • 38