Tony Zhao

tianchez

AI & ML interests

Multimodal Agent, Generative AI

Organizations

Om AI Lab

Posts 1

Introducing VLM-R1!

GRPO helped DeepSeek R1 learn to reason. Can it also help VLMs perform better on general computer vision tasks?

The answer is YES, and it generalizes better than SFT. We trained Qwen 2.5 VL 3B on RefCOCO (a visual grounding task) and evaluated it on RefCOCO Val and RefGTA (an OOD task).

https://github.com/om-ai-lab/VLM-R1
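
For readers new to GRPO, here is a minimal sketch of its core idea applied to a grounding task: sample several answers for the same prompt, score each one, and normalize the scores within that group to get advantages. The IoU-based reward, the function names, and the use of the population standard deviation are assumptions made for illustration; the post does not state which reward VLM-R1 uses, so see the repository for the actual implementation.

```python
from statistics import mean, pstdev


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of sampled answers.

    Each advantage is (reward - group mean) / (group std + eps), so answers
    that beat the group average get a positive advantage.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations differ
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical example: one ground-truth box and three sampled predictions.
target = (0, 0, 2, 2)
predictions = [(0, 0, 2, 2), (1, 1, 3, 3), (5, 5, 6, 6)]
rewards = [iou(p, target) for p in predictions]
advantages = group_relative_advantages(rewards)
```

In this toy group the exact match receives the largest advantage and the non-overlapping box the smallest; a policy update would then raise the likelihood of the better-scoring answers.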

Articles 2

Trials, Errors, and Breakthroughs: Our Rocky Road to OVD SOTA with Reinforcement Learning

Datasets

None public yet