V-DPO: Mitigating Hallucination in Large Vision Language Models via Vision-Guided Direct Preference Optimization Paper • 2411.02712 • Published Nov 5, 2024
Manager: Aggregating Insights from Unimodal Experts in Two-Tower VLMs and MLLMs Paper • 2506.11515 • Published Jun 13, 2025
AI4Research: A Survey of Artificial Intelligence for Scientific Research Paper • 2507.01903 • Published Jul 2, 2025
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14, 2025
M$^3$CoT: A Novel Benchmark for Multi-Domain Multi-step Multi-modal Chain-of-Thought Paper • 2405.16473 • Published May 26, 2024
Self-Constructed Context Decompilation with Fined-grained Alignment Enhancement Paper • 2406.17233 • Published Jun 25, 2024
A Two-Stage Framework with Self-Supervised Distillation For Cross-Domain Text Classification Paper • 2304.09820 • Published Apr 18, 2023
Text is no more Enough! A Benchmark for Profile-based Spoken Language Understanding Paper • 2112.11953 • Published Dec 22, 2021
Exploring Multi-Grained Concept Annotations for Multimodal Large Language Models Paper • 2412.05939 • Published Dec 8, 2024