
Jihan Yang PRO

jihanyang

https://jihanyang.github.io/

AI & ML interests

Computer Vision, Multimodality, Embodied AI

Recent Activity

upvoted a paper 22 days ago

MetaCLIP 2: A Worldwide Scaling Recipe

Paper • 2507.22062 • Published 24 days ago • 22

upvoted a paper about 1 month ago

Scaling RL to Long Videos

Paper • 2507.07966 • Published Jul 10 • 156

updated a dataset 4 months ago

jihanyang/tomato

Viewer • Updated Apr 27 • 1.48k • 3

published a dataset 4 months ago

jihanyang/tomato

Viewer • Updated Apr 27 • 1.48k • 3

liked a dataset 4 months ago

allenai/pixmo-points

Viewer • Updated Nov 27, 2024 • 2.38M • 2.48k • 30

updated a dataset 5 months ago

spatial-tuning/na_tasks_with_new_units

Viewer • Updated Mar 19 • 7.72k • 5

published 2 datasets 5 months ago

spatial-tuning/na_tasks_with_new_units

Viewer • Updated Mar 19 • 7.72k • 5

spatial-tuning/object_counting_120_gt2

Viewer • Updated Mar 10 • 383 • 3

updated a dataset 5 months ago

spatial-tuning/object_counting_120_gt2

Viewer • Updated Mar 10 • 383 • 3

authored a paper 6 months ago

UniTok: A Unified Tokenizer for Visual Generation and Understanding

Paper • 2502.20321 • Published Feb 27 • 30

liked a model 7 months ago

deepseek-ai/DeepSeek-R1

Text Generation • 685B • Updated Mar 27 • 747k • 12.6k

authored a paper 7 months ago

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28 • 123

upvoted a paper 7 months ago

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28 • 123

updated a dataset 7 months ago

nyu-visionx/VSI-Bench

Viewer • Updated Jan 14 • 5.13k • 3.18k • 45

liked 2 datasets 8 months ago

ShareGPT4Video/ShareGPT4Video

Viewer • Updated Mar 7 • 40.2k • 2.59k • 199

nyu-visionx/VSI-Bench

Viewer • Updated Jan 14 • 5.13k • 3.18k • 45

authored a paper 8 months ago

Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces

Paper • 2412.14171 • Published Dec 18, 2024 • 24

upvoted a paper 8 months ago

Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces

Paper • 2412.14171 • Published Dec 18, 2024 • 24

updated 2 datasets about 1 year ago

nyu-visionx/Cambrian-Alignment

Viewer • Updated Jul 23, 2024 • 292k • 5.11k • 35

jihanyang/RegionPLC_ScanNet200

Updated Jul 5, 2024
