GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning Paper • 2507.01006 • Published 6 days ago • 171
LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs Paper • 2506.21862 • Published 11 days ago • 34
HumanOmniV2: From Understanding to Omni-Modal Reasoning with Context Paper • 2506.21277 • Published 11 days ago • 14
HumanOmni: A Large Vision-Speech Language Model for Human-Centric Video Understanding Paper • 2501.15111 • Published Jan 25 • 1
LLaVA-Octopus: Unlocking Instruction-Driven Adaptive Projector Fusion for Video Understanding Paper • 2501.05067 • Published Jan 9 • 1
Facial Dynamics in Video: Instruction Tuning for Improved Facial Expression Perception and Contextual Awareness Paper • 2501.07978 • Published Jan 14