Learning Video LLM with Streaming Speech Transcription at Scale (CVPR 2025)
Joya Chen
chenjoya
AI & ML interests: Video LLM
Recent Activity
upvoted a paper (6 days ago): Show-o2: Improved Native Unified Multimodal Models
liked a model (8 days ago): showlab/show-o2-7B
updated a dataset (8 days ago): chenjoya/Live-WhisperX-526K