AI & ML interests
Exploring the diversity in synthetic data for pretraining large (and smol) language models.
SynthD's datasets
None public yet