Reward Models Collection Nemotron reward models. For use in RLHF pipelines and LLM-as-a-Judge • 8 items • Updated 9 days ago • 20
Step 1: Reproducing DeepSeek's Distilled Models Collection Code for training and evaluation: https://github.com/huggingface/open-r1 • 3 items • Updated May 26 • 3
Article Welcome GPT OSS, the new open-source model family from OpenAI! By reach-vb and 11 others • 19 days ago • 472
GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning Paper • 2507.19457 • Published 30 days ago • 24
Article Improving Parquet Dedupe on Hugging Face Hub By yuchenglow and 1 other • Oct 5, 2024 • 38
Article Bourbaki (7b): SOTA 7B Algorithms for Putnam Bench (Part I: Reasoning MDPs) By hba123 and 2 others • Jul 13 • 11
Nile-Chat: Egyptian Language Models for Arabic and Latin Scripts Paper • 2507.04569 • Published Jul 6 • 19
Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • Jul 8 • 636
Article Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance By tiiuae and 5 others • May 21 • 34
Article Introducing the Open Arabic LLM Leaderboard By alielfilali01 and 4 others • May 14, 2024 • 97
🧠 Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 24 items • Updated May 19 • 164
AI2 Safety Toolkit Collection Safety data, moderation tools and safe LLMs. • 6 items • Updated Apr 30 • 8