wuziheng

AI & ML interests

CV/SSL/MultiMedia

Recent Activity

reacted to tianchez's post with 🚀 1 day ago

Introducing VLM-R1! GRPO helped DeepSeek R1 learn reasoning. Can it also help VLMs perform better on general computer vision tasks? The answer is YES, and it generalizes better than SFT. We trained Qwen 2.5 VL 3B on RefCOCO (a visual grounding task) and evaluated it on RefCOCO Val and RefGTA (an OOD task). https://github.com/om-ai-lab/VLM-R1

updated a model about 1 month ago

bytedance-research/Valley-Eagle-7B

new activity about 1 month ago

bytedance-research/Valley-Eagle-7B:Update README.md

Papers 2

arxiv:2309.12424

arxiv:2307.08579

Models

None public yet

Datasets

None public yet