Data-Juicer

community

https://github.com/modelscope/data-juicer

AI & ML interests

None defined yet.

Recent Activity

yxdyc

updated a Space 24 days ago

README

lingzhq11

authored a paper 26 days ago

MindGYM: Enhancing Vision-Language Models via Synthetic Self-Challenging Questions

Paper • 2503.09499 • Published Mar 12 • 3

LuckyBanana

authored a paper 11 months ago

Img-Diff: Contrastive Data Synthesis for Multimodal Large Language Models

Paper • 2408.04594 • Published Aug 8, 2024 • 15

yxdyc

authored a paper 11 months ago

Img-Diff: Contrastive Data Synthesis for Multimodal Large Language Models

Paper • 2408.04594 • Published Aug 8, 2024 • 15

yxdyc

authored a paper 12 months ago

The Synergy between Data and Multi-Modal Large Language Models: A Survey from Co-Development Perspective

Paper • 2407.08583 • Published Jul 11, 2024 • 13

yxdyc

authored a paper about 1 year ago

Data Mixing Made Efficient: A Bivariate Scaling Law for Language Model Pretraining

Paper • 2405.14908 • Published May 23, 2024 • 16

yxdyc

authored 2 papers over 1 year ago

AgentScope: A Flexible yet Robust Multi-Agent Platform

Paper • 2402.14034 • Published Feb 21, 2024 • 14

Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes

Paper • 2312.06353 • Published Dec 11, 2023 • 7

zhijianma

updated 2 datasets over 1 year ago

datajuicer/alpaca-cot-en-refined-by-data-juicer

Viewer • Updated Nov 10, 2023 • 5 • 20

datajuicer/alpaca-cot-zh-refined-by-data-juicer

Viewer • Updated Nov 10, 2023 • 5 • 33 • 5

zhijianma

updated 6 models over 1 year ago

Descartes

updated 4 datasets over 1 year ago

datajuicer/the-pile-nih-refined-by-data-juicer

Viewer • Updated Oct 23, 2023 • 100 • 22

datajuicer/the-pile-pubmed-abstracts-refined-by-data-juicer

Viewer • Updated Oct 23, 2023 • 100 • 32 • 2

datajuicer/the-pile-pubmed-central-refined-by-data-juicer

Viewer • Updated Oct 23, 2023 • 100 • 30 • 2

datajuicer/the-pile-europarl-refined-by-data-juicer

Viewer • Updated Oct 23, 2023 • 100 • 16

Team members 14
