MLLM - a doubao Collection

Updated Dec 2, 2024

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Paper • 2403.05530 • Published Mar 8, 2024 • 64

MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation

Paper • 2410.11779 • Published Oct 15, 2024 • 26

What Matters in Transformers? Not All Attention is Needed

Paper • 2406.15786 • Published Jun 22, 2024 • 31

Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention

Paper • 2410.10774 • Published Oct 14, 2024 • 26

DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding

Paper • 2411.19527 • Published Nov 29, 2024 • 10