AI & ML interests

Merging models

Recent Activity

merge-crew's activity

mlabonne
posted an update 7 days ago
mlabonne
posted an update 8 days ago
✂️ Gemma 3 Abliterated

I noticed that Gemma 3 was much more resilient to refusal removal than other models like Qwen 2.5.

I experimented with different recipes and improved the abliteration technique I wrote about last year.

It's still experimental but the refusal rate is super low in my tests. Enjoy!

mlabonne/gemma-3-4b-it-abliterated
mlabonne/gemma-3-12b-it-abliterated
mlabonne/gemma-3-27b-it-abliterated
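
For readers new to the term, here is a minimal sketch of the basic idea commonly described as abliteration: estimate a "refusal direction" as the difference between mean activations on harmful and harmless prompts, then orthogonalize a weight matrix against it so the layer can no longer write along that direction. This is a toy illustration on random data, not the improved recipe used for the Gemma 3 models above. The function names, shapes, and synthetic activations are all assumptions made for the example.

```python
import math
import random


def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def refusal_direction(harmful_acts, harmless_acts):
    """Unit vector along the difference of mean activations (hypothetical helper)."""
    diff = [a - b for a, b in zip(mean_vector(harmful_acts), mean_vector(harmless_acts))]
    norm = math.sqrt(sum(v * v for v in diff))
    return [v / norm for v in diff]


def orthogonalize(weight, direction):
    """Return W - r (r^T W), so that the output W @ x has no component along r.

    weight is a list of d_model rows, each of length d_in.
    """
    d_in = len(weight[0])
    # proj[j] = sum_i direction[i] * weight[i][j], i.e. the row vector r^T W
    proj = [sum(d * row[j] for d, row in zip(direction, weight)) for j in range(d_in)]
    return [[w - d * p for w, p in zip(row, proj)] for d, row in zip(direction, weight)]


def matvec(weight, x):
    """Plain matrix-vector product."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


# Synthetic stand-ins for real activations (assumption: toy data only).
random.seed(0)
d_model, d_in = 6, 4
harmless = [[random.gauss(0, 1) for _ in range(d_model)] for _ in range(32)]
# Shift the first coordinate so the two sets differ along a clear direction.
harmful = [[random.gauss(0, 1) + (2.0 if i == 0 else 0.0) for i in range(d_model)]
           for _ in range(32)]

r = refusal_direction(harmful, harmless)
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_model)]
W_abl = orthogonalize(W, r)

x = [random.gauss(0, 1) for _ in range(d_in)]
before = abs(dot(r, matvec(W, x)))      # component along r before the edit
leftover = abs(dot(r, matvec(W_abl, x)))  # numerically zero after the edit
```

In a real model the activations would come from a chosen layer of the network and the projection would be applied to every matrix that writes into the residual stream; which layer and which matrices to use is exactly the kind of recipe choice the post describes as experimental.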

mlabonne
posted an update 2 months ago
🆕 LLM Course 2025 edition!

I updated the LLM Scientist roadmap and added a ton of new information and references. It covers training, datasets, evaluation, quantization, and new trends like test-time compute scaling.

The LLM Course has been incredibly popular (41.3k stars!) and I've been touched to receive many, many messages about how it helped people in their careers.

I know how difficult this stuff can be, so I'm super proud of the impact it had. I want to keep updating it in 2025, especially with the LLM Engineer roadmap.

Thanks everyone, hope you'll enjoy it!

💻 LLM Course: https://huggingface.co/blog/mlabonne/llm-course
mlabonne
posted an update 9 months ago
Large models are surprisingly bad storytellers.

I asked 8 LLMs to "Tell me a bedtime story about bears and waffles."

Claude 3.5 Sonnet and GPT-4o gave me the worst stories: no conflict, no moral, zero creativity.

In contrast, smaller models were quite creative and wrote stories involving talking waffle trees and bears ostracized for their love of waffles.

Here you can see a comparison between Claude 3.5 Sonnet and NeuralDaredevil-8B-abliterated. They both start with a family of bears but quickly diverge in terms of personality, conflict, etc.

I mapped it to the hero's journey to have some kind of framework. Prompt engineering can definitely help here, but it's still disappointing that the larger models don't create better stories right off the bat.

Do you know why smaller models outperform the frontier models here?