Enhancing Multimodal Large Language Models with Vision Detection Models: An Empirical Study Paper • 2401.17981 • Published Jan 31, 2024 • 1
Data-Juicer: A One-Stop Data Processing System for Large Language Models Paper • 2309.02033 • Published Sep 5, 2023 • 4
DAMO-YOLO : A Report on Real-Time Object Detection Design Paper • 2211.15444 • Published Nov 23, 2022
The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective Paper • 2407.08583 • Published Jul 11, 2024 • 13
Data-Juicer Sandbox: A Comprehensive Suite for Multimodal Data-Model Co-development Paper • 2407.11784 • Published Jul 16, 2024 • 4
Img-Diff: Contrastive Data Synthesis for Multimodal Large Language Models Paper • 2408.04594 • Published Aug 8, 2024 • 15
DetailMaster: Can Your Text-to-Image Model Handle Long Prompts? Paper • 2505.16915 • Published May 22, 2025
Trinity-RFT: A General-Purpose and Unified Framework for Reinforcement Fine-Tuning of Large Language Models Paper • 2505.17826 • Published May 23, 2025 • 9
Data-Juicer 2.0: Cloud-Scale Adaptive Data Processing for and with Foundation Models Paper • 2501.14755 • Published Dec 23, 2024
DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network Paper • 2303.02165 • Published Mar 5, 2023