MindGYM: Enhancing Vision-Language Models via Synthetic Self-Challenging Questions Paper • 2503.09499 • Published Mar 12, 2025 • 3
Img-Diff: Contrastive Data Synthesis for Multimodal Large Language Models Paper • 2408.04594 • Published Aug 8, 2024 • 15
The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective Paper • 2407.08583 • Published Jul 11, 2024 • 13
Data Mixing Made Efficient: A Bivariate Scaling Law for Language Model Pretraining Paper • 2405.14908 • Published May 23, 2024 • 16
AgentScope: A Flexible yet Robust Multi-Agent Platform Paper • 2402.14034 • Published Feb 21, 2024 • 13
Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes Paper • 2312.06353 • Published Dec 11, 2023 • 7
datajuicer/the-pile-pubmed-abstracts-refined-by-data-juicer Viewer • Updated Oct 23, 2023 • 100 • 21 • 2
datajuicer/the-pile-pubmed-central-refined-by-data-juicer Viewer • Updated Oct 23, 2023 • 100 • 32 • 2