Korek Rybens
Rybens

AI & ML interests
None yet
Recent Activity
liked a Space 3 days ago
smolagents/smolagents-leaderboard
liked a model 18 days ago
OpenPipe/Deductive-Reasoning-Qwen-32B
reacted to burtenshaw's post with 🤗 22 days ago
I’m super excited to work with @mlabonne to build the first practical example in the reasoning course.
🔗 https://huggingface.co/reasoning-course
Here's a quick walkthrough of the first drop of material, which works toward the use case:
- A fundamental introduction to reinforcement learning, answering questions like ‘what is a reward?’ and ‘how do we create an environment for a language model?’
- Then it focuses on DeepSeek R1 by walking through the paper and highlighting key aspects. This is an old-school way to learn ML topics, but it always works.
- Next, it takes you to Transformer Reinforcement Learning (TRL) and demonstrates potential reward functions you could use. This is cool because it uses Marimo notebooks to visualise the reward.
- Finally, Maxime walks us through a real training notebook that uses GRPO to reduce generation length. I’m really into this because it works, and Maxime took the time to validate it and share assets and logs from his own runs for you to compare against.
Maxime’s work and notebooks have been a major part of the open source community over the last few years. I, like everyone, have learnt so much from them.
Organizations
Rybens's activity
Performance Raport in my project
2
#1 opened 7 months ago by Rybens
Other models updates and ggufs
17
#1 opened 8 months ago by Rybens
could you also release gguf of codestral 7b?
2
#3 opened 8 months ago by txhno
GGUF Version?
20
#1 opened about 1 year ago by johnnnna
larger gguf quantized versions
3
#1 opened about 1 year ago by johnnnna
GGUF Version
18
#1 opened about 1 year ago by SimSim93
Best 7B model for my use case
#1 opened over 1 year ago by Rybens
Add Diffusers weights
#19 opened over 2 years ago by Rybens