FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language
Paper • 2506.20920 • Published • 26
retrain-pipelines 0.1.2 finally dropped. It comes with a hot Hugging Face Hub integration. Go check it out. We have two articles about it coming up, one of them already fully written, so be on the lookout!
retrain-pipelines 0.1.1 released today. The docs are also spruced up compared to the previous release; that was clearly not mature back then.