9

Chenxi Qin

DislikeRetinalSpike

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 20 days ago

Temporal Gains, Spatial Costs: Revisiting Video Fine-Tuning in Multimodal Large Language Models

Paper • 2603.17541 • Published 21 days ago • 20

upvoted a paper 2 months ago

OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models

Paper • 2602.04804 • Published Feb 4 • 50

upvoted 5 papers 7 months ago

SAVEn-Vid: Synergistic Audio-Visual Integration for Enhanced Understanding in Long Video Context

Paper • 2411.16213 • Published Nov 25, 2024 • 2

VideoMark: A Distortion-Free Robust Watermarking Framework for Video Diffusion Models

Paper • 2504.16359 • Published Apr 23, 2025 • 3

RTV-Bench: Benchmarking MLLM Continuous Perception, Understanding and Reasoning through Real-Time Video

Paper • 2505.02064 • Published May 4, 2025 • 4

UltraEdit: Training-, Subject-, and Memory-Free Lifelong Editing in Large Language Models

Paper • 2505.14679 • Published May 20, 2025 • 5

PhysicsArena: The First Multimodal Physics Reasoning Benchmark Exploring Variable, Process, and Solution Dimensions

Paper • 2505.15472 • Published May 21, 2025 • 3

upvoted a collection 7 months ago

Streaming Video Benchmark

1 item • Updated Dec 20, 2025 • 1

upvoted a paper 7 months ago

Mind the Third Eye! Benchmarking Privacy Awareness in MLLM-powered Smartphone Agents

Paper • 2508.19493 • Published Aug 27, 2025 • 11