AI & ML interests
Data curation, high-quality data, multilinguality, NLP & computational linguistics
Articles
- FineWeb-C: A Community-Driven Dataset for Educational Quality Annotations in 122 Languages
- FineWeb2-C: Help Build Better Language Models in Your Language (published about 1 year ago)
- Argilla 2.4: Easily Build Fine-Tuning and Evaluation Datasets on the Hub — No Code Required (published over 1 year ago)
- How to build a custom text classifier without days of human labeling (published over 1 year ago)
- How to optimize your data labelling project with custom interfaces (published over 1 year ago)