Xin Yan

Cakeyan

https://cakeyan.github.io/

AI & ML interests

Reasoning in NLP and CV.

Organizations

None yet

authored 2 papers 4 months ago

Decoupling Contrastive Decoding: Robust Hallucination Mitigation in Multimodal Large Language Models

Paper • 2504.08809 • Published Apr 9, 2025 • 1

Seedream 4.0: Toward Next-generation Multimodal Image Generation

Paper • 2509.20427 • Published Sep 24, 2025 • 82

authored a paper about 1 year ago

Long Video Diffusion Generation with Segmented Cross-Attention and Content-Rich Video Data Curation

Paper • 2412.01316 • Published Dec 2, 2024 • 10

authored 3 papers almost 2 years ago

Centroid-centered Modeling for Efficient Vision Transformer Pre-training

Paper • 2303.04664 • Published Mar 8, 2023

ContPhy: Continuum Physical Concept Learning and Reasoning from Videos

Paper • 2402.06119 • Published Feb 9, 2024 • 1

3D-VLA: A 3D Vision-Language-Action Generative World Model

Paper • 2403.09631 • Published Mar 14, 2024 • 12