SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond Paper • 2505.19641 • Published 15 days ago • 64
🧠 Traditional Chinese Reasoning Datasets Collection A curated collection of datasets designed to evaluate and train reasoning capabilities in Traditional Chinese across various domains. • 3 items • Updated 27 days ago • 8
🏠 ParScale-1.8B Collection Base models trained on 1T high-quality tokens, competitive with existing SOTA small models (<2B). • 4 items • Updated 23 days ago • 2
Phi-4 Collection Phi-4 family of small language, multi-modal and reasoning models. • 13 items • Updated May 1 • 154
LiveCC Collection Learning Video LLM with Streaming Speech Transcription at Scale (CVPR 2025) • 8 items • Updated Apr 23 • 4
BitNet Collection 🔥 BitNet family of large language models (1-bit LLMs). • 7 items • Updated May 1 • 42
Llama Nemotron Collection Open, production-ready enterprise models • 8 items • Updated 3 days ago • 60
Physical AI Collection Collection of commercial-grade datasets for physical AI developers • 15 items • Updated 3 days ago • 55
DRAMA Collection A collection of small (sub-1B) multilingual dense retrievers that generalize well across a number of tasks and languages. • 3 items • Updated Feb 26 • 7
NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions Paper • 2502.13124 • Published Feb 18 • 6
OpenR1-Math Collection Dataset and SFT model distilled from DeepSeek-R1. Check out our blog post for more details: https://huggingface.co/blog/open-r1/update-2 • 3 items • Updated 28 days ago • 9
Llasa Collection TTS foundation models compatible with the Llama framework (160k hours of tokenized speech data released) • 11 items • Updated 30 days ago • 18
olmOCR Collection olmOCR is a document recognition pipeline for efficiently converting documents into plain text. olmocr.allenai.org • 3 items • Updated 21 days ago • 114
Article SmolVLM - small yet mighty Vision Language Model By andito and 4 others • Nov 26, 2024 • 310