AI & ML interests

Exploring diversity in synthetic data for pretraining large (and smol) language models.

SynthD's datasets

None public yet