oussama yousr (oussama) · 0 followers · 3 following
AI & ML interests
None yet
Recent Activity
reacted to prithivMLmods's post with 🚀 · about 15 hours ago
Reasoning SmolLM2 🚀
🎯 Fine-tuning SmolLM2 on a lightweight synthetic reasoning dataset for reasoning-specific tasks. Future updates will focus on lightweight, blazing-fast reasoning models. Until then, check out the blog for fine-tuning details.
🔥 Blog: https://huggingface.co/blog/prithivMLmods/smollm2-ft
🔼 Models:
+ SmolLM2-CoT-360M: https://huggingface.co/prithivMLmods/SmolLM2-CoT-360M
+ Reasoning-SmolLM2-135M: https://huggingface.co/prithivMLmods/Reasoning-SmolLM2-135M
+ SmolLM2-CoT-360M-GGUF: https://huggingface.co/prithivMLmods/SmolLM2-CoT-360M-GGUF
🤠 Other Details:
+ Demo notebook: https://huggingface.co/prithivMLmods/SmolLM2-CoT-360M/blob/main/Demo/SmolLM2%20Demo.ipynb
+ Fine-tuning notebook: https://huggingface.co/prithivMLmods/SmolLM2-CoT-360M/blob/main/finetune/SmolLM-FT.ipynb
reacted to fdaudens's post with ❤️ · about 15 hours ago
Yes, DeepSeek R1's release is impressive. But the real story is what happened in just 7 days after:
- Original release: 8 models, 540K downloads. Just the beginning...
- The community turned those open-weight models into +550 NEW models on Hugging Face. Total downloads? 2.5M, nearly 5X the originals.
The reason? DeepSeek models are open-weight, letting anyone build on top of them. Interesting to note that the community focused on quantized versions for better efficiency & accessibility. They want models that use less memory, run faster, and are more energy-efficient.
When you empower builders, innovation explodes. For everyone. 🚀
The most popular community model? @bartowski's DeepSeek-R1-Distill-Qwen-32B-GGUF version, with 1M downloads alone.
reacted to lewtun's post with ❤️ · about 15 hours ago
Introducing OpenR1-Math-220k! https://huggingface.co/datasets/open-r1/OpenR1-Math-220k
The community has been busy distilling DeepSeek-R1 from inference providers, but we decided to have a go at doing it ourselves from scratch 💪
What's new compared to existing reasoning datasets?
♾ Based on https://huggingface.co/datasets/AI-MO/NuminaMath-1.5: we focus on math reasoning traces and generate answers for problems in NuminaMath 1.5, an improved version of the popular NuminaMath-CoT dataset.
🐳 800k R1 reasoning traces: We generate two answers for 400k problems using DeepSeek R1. The filtered dataset contains 220k problems with correct reasoning traces.
📀 512 H100s running locally: Instead of relying on an API, we leverage vLLM and SGLang to run generations locally on our science cluster, generating 180k reasoning traces per day.
⏳ Automated filtering: We apply Math Verify to retain only problems with at least one correct answer. We also leverage Llama3.3-70B-Instruct as a judge to retrieve more correct examples (e.g. for cases with malformed answers that can't be verified with a rules-based parser).
📊 We match the performance of DeepSeek-Distill-Qwen-7B by finetuning Qwen-7B-Math-Instruct on our dataset.
🔎 Read our blog post for all the nitty-gritty details: https://huggingface.co/blog/open-r1/update-2
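The automated-filtering idea in the post above can be sketched in a few lines: keep only problems where at least one generated answer verifies against the reference, and discard the incorrect traces. This is a minimal, hypothetical Python sketch, not the actual OpenR1 pipeline; the record layout and the `verify` callback (stand-in for Math Verify plus the LLM judge) are assumptions.

```python
# Hypothetical sketch of "retain problems with at least one correct answer".
# The real pipeline uses Math Verify and an LLM judge; here `verify` is a
# caller-supplied callback and the record layout is invented for illustration.

def filter_correct(problems, verify):
    """Keep problems with >= 1 verified-correct generation; drop wrong traces."""
    kept = []
    for p in problems:
        correct = [g for g in p["generations"] if verify(g["answer"], p["gold"])]
        if correct:
            kept.append({**p, "generations": correct})
    return kept

# Toy usage with a naive exact-match "verifier":
data = [
    {"id": 1, "gold": "42", "generations": [{"answer": "42"}, {"answer": "41"}]},
    {"id": 2, "gold": "7", "generations": [{"answer": "8"}]},
]
kept = filter_correct(data, lambda a, b: a == b)
# Only problem 1 survives, with its single correct trace retained.
```

In the dataset described above, the same shape applies at scale: two generations per problem, with only verified traces kept.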
Organizations
None yet
spaces (2)
+ 🦀 LayoutLMv1 · Runtime error
+ 📚 LyoutLMv3 Invoice · Runtime error
models (2)
+ oussama/layoutlmv3-finetuned-invoice · Token Classification · Updated Aug 10, 2023 · 133 · 5
+ oussama/Layoutlm_Form_information_extraction · Token Classification · Updated Aug 8, 2023 · 15
datasets
None public yet