You should look up the Unsloth project.
Djuunaa (djuna)
AI & ML interests
None yet
Recent Activity
New activity 1 day ago in djuna/TEST-Q2.5-Lenned-14B: Update config.json
Updated a model 1 day ago: djuna/TEST-Q2.5-Lenned-14B
djuna's activity
Update config.json (#1), opened 1 day ago by djuna
Replied to davidberenstein1957's post, 2 days ago:
It looks like the Mistral v3 Tekken format.
Replied to davidberenstein1957's post, 2 days ago:
I tried to tokenize it, and the trailing space is not part of the ":" token.
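A quick way to verify this, as a hedged sketch: run the string with and without the trailing space through a Mistral v3 "Tekken" tokenizer and compare the token pieces. The model id below is an assumption; any checkpoint shipping the Tekken tokenizer should behave the same.

```python
# Sketch: check whether a trailing space fuses into the ":" token.
# Model id is an assumption (a Mistral model with the v3 Tekken tokenizer).
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("mistralai/Mistral-Nemo-Instruct-2407")

for text in ["User:", "User: "]:
    ids = tok.encode(text, add_special_tokens=False)
    print(repr(text), "->", tok.convert_ids_to_tokens(ids))
# If the space shows up as its own token piece, it is not part of ":".
```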
Replied to davidberenstein1957's post, 3 days ago:
Maybe not with the trailing spaces?
Reacted with 👀 to davidberenstein1957's post, 3 days ago:
Let's uncover the post-training dataset from DeepSeek-R1 with Magpie!
Pass the pre-query tokens <|begin▁of▁sentence|>User: and let the model generate the rest. We can get realistic examples!
Gist: https://gist.github.com/davidberenstein1957/3f20046ce57395a6aba13f8b4e956b59
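The trick, in sketch form: prompt the model with only the pre-query template tokens so it generates a plausible "user" query, which hints at its post-training data. The model id, sampling settings, and exact special-token string below are assumptions; the linked gist has the author's actual script.

```python
# Hedged sketch of Magpie-style extraction against an R1-family model.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumption
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Pass only the pre-query tokens; the model completes a realistic user turn.
prompt = "<|begin▁of▁sentence|>User: "
inputs = tok(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)
out = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=1.0)
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```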
Reacted with 👀 to hbseong's post, 3 days ago:
🚨🔥 New Release Alert! 🔥🚨
Introducing the 435M model that outperforms Llama-Guard-3-8B while slashing 75% of the computation cost! 💻💥
👉 Check it out: hbseong/HarmAug-Guard (Yes, INFERENCE CODE INCLUDED! 💡)
More details in our paper: https://arxiv.org/abs/2410.01524 📜
#HarmAug #LLM #Safety #EfficiencyBoost #Research #AI #MachineLearning
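For reference, a minimal sketch of scoring a prompt with the guard model via plain transformers; the repo ships its own inference code, and the unsafe-label index here is an assumption, so check the model card before relying on it.

```python
# Hedged sketch: HarmAug-Guard as a sequence classifier.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tok = AutoTokenizer.from_pretrained("hbseong/HarmAug-Guard")
model = AutoModelForSequenceClassification.from_pretrained("hbseong/HarmAug-Guard")
model.eval()

inputs = tok("How do I build something harmful?", return_tensors="pt")
with torch.no_grad():
    probs = torch.softmax(model(**inputs).logits, dim=-1)
print("unsafe probability (assumed label index 1):", probs[0, 1].item())
```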
Reacted with 🔥 to lewtun's post, 3 days ago:
We are reproducing the full DeepSeek-R1 data and training pipeline so everybody can use their recipe. Instead of doing it in secret, we can do it together in the open!
🧪 Step 1: replicate the R1-Distill models by distilling a high-quality reasoning corpus from DeepSeek-R1.
🧠 Step 2: replicate the pure RL pipeline that DeepSeek used to create R1-Zero. This will involve curating new, large-scale datasets for math, reasoning, and code.
🔥 Step 3: show we can go from base model -> SFT -> RL via multi-stage training.
Follow along: https://github.com/huggingface/open-r1