NVIDIA Nemotron Collection Open, Production-ready Enterprise Models. NVIDIA Open Model License. • 3 items • Updated 4 days ago • 44
DINOv3 Collection DINOv3: foundation models producing excellent dense features, outperforming the state of the art without fine-tuning - https://arxiv.org/abs/2508.10104 • 13 items • Updated about 13 hours ago • 212
Article Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face By abidlabs and 4 others • 24 days ago • 157
Common Pile v0.1 Raw Data Collection 8TB of public domain and openly licensed text • 30 items • Updated 8 days ago • 18
GLM-4.5 Collection GLM-4.5: An open-source large language model designed for intelligent agents by Z.ai • 11 items • Updated 11 days ago • 218
Institutional Books Collection A growing corpus of public domain books from library collections, seeded by Harvard Library. • 3 items • Updated Jun 11 • 6
Survivor Library Books - OCR Collection Books from the Survivor Library (mostly ~1920s & earlier) OCR'd with recent VLMs • 2 items • Updated Jul 14 • 5
SmolLM3 pretraining datasets Collection datasets used in SmolLM3 pretraining • 15 items • Updated 10 days ago • 28
Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • Jul 8 • 635
Energy-Based Transformers are Scalable Learners and Thinkers Paper • 2507.02092 • Published Jul 2 • 60
ERNIE 4.5 Collection Collection of ERNIE 4.5 models. "-Paddle" models use PaddlePaddle weights, while "-PT" models use Transformer-style PyTorch weights. • 25 items • Updated Jul 11 • 158
✂️ Abliteration Collection Uncensored models using abliteration. See this article for more information: huggingface.co/blog/mlabonne/abliteration • 34 items • Updated 28 days ago • 108
📀 Dataset comparison models Collection 1.8B-parameter models trained on 350B tokens to compare different pretraining datasets • 8 items • Updated Jun 12, 2024 • 40