Undi95 (Undi)

replied to their post 4 months ago

That's what some of my dataset do, but then you're still stuck with one reply trained, not an entire conversation.
I break my head around that haha

Edit: I missread,if you add multiple in the context, the model is confused because they are trimmed out of the context by the chat template to not waste token we don't need anymore.
So we can't train it like this either, because the bot will have multiple thinking process in the conversation.

replied to their post 4 months ago

You could do that but in that case the bot will not use <think>because it's not trained on all of the reply to do it.

What I would ideally want is a model that apply the thinking itself without system prompt or prefilling

posted an update 4 months ago

Post

8950

Hi there!

If you want to create your own thinking model or do a better MistralThinker, I just uploaded my entire dataset made on Deepseek R1 and the axolotl config. (well I made them public)

Axolotl config : Undi95/MistralThinker-v1.1

The dataset : Undi95/R1-RP-ShareGPT3

You can also read all I did on those two discord screenshot from two days ago, I'm a little lazy to rewrite all kek.

Hope you will use them!

6 replies

·

replied to their post 11 months ago

Llama 3.1 model got their tokenizer_config file modified. We updated them.
GGUF already done will have old chat template inside but they still work properly.

posted an update 12 months ago

Post

21505

Exciting news!

After a long wait, Ikari and me finally made a new release of our last model on NeverSleep repo: Lumimaid-v0.2

This model can be used in different size, from the small Llama-3.1-8B to the gigantic Mistral-Large-123B, finetuned by us.

Try them now!

- NeverSleep/Lumimaid-v0.2-8B
- NeverSleep/Lumimaid-v0.2-12B
- NeverSleep/Lumimaid-v0.2-70B
- NeverSleep/Lumimaid-v0.2-123B

All the datasets we used will be added and credit will be given!
For the quant, we wait for fix to be applied (https://github.com/ggerganov/llama.cpp/pull/8676)
Hope you will enjoy them!

4 replies

·

replied to their post 12 months ago

Just curious, how much difference in intelligence do you think there would be between the 68 and 39 refusals? Would there be any reason to use the 68? More realistic characters maybe?

Thanks for all the models you've shared

Thing is modifying direction like this make perplexity higher, and output is of lower quality. So we need to find a balance, I took the two best model that got made by the script.

If you get 0 refusal for exemple, it will never refuse anything but it could break the model and make it dumb asf, and you're welcome!

replied to their post 12 months ago

Hello there, I written a wall of text and my webpage refreshed haha, so let's me summarize again.

This method is called Orthogonal Activation Steering, it come from here : https://www.lesswrong.com/posts/jGuXSZgv6qfdhMCuJ/refusal-in-llms-is-mediated-by-a-single-direction
Then, a demo using TransformersLens was available using a Qwen model, but the resulting model couldn't be saved : https://colab.research.google.com/drive/1a-aQvKC9avdZpdyBn4jgRQFObTPy1JZw?usp=sharing#scrollTo=j7hOtw7UHXdD

Following that, wassname made a modification of this demo, and made a first script, we talked about this here : https://huggingface.co/posts/Undi95/318385306588047
The OG script isn't available because he updated it here : https://gist.github.com/wassname/42aba7168bb83e278fcfea87e70fa3af

TransformersLens then got replaced for Baukit.

Failspy made his own notebook too, calling the method abliteration, but it's the same thing : https://huggingface.co/failspy/llama-3-70B-Instruct-abliterated/blob/main/ortho_cookbook.ipynb

Finally, to reply to your answer, for this project I used a script from Lucyknada, with 1xH100 80GB, and I let it run for like 15 minutes before I found a direction with 36 refusal for 3000 toxic prompt.
It's easy and automatic, you can modify it easily too : https://github.com/lucyknada/baukit-modified

Dunno for Nemo.

Hope this help

posted an update 12 months ago

Post

15458

Hello there,

New model released, my goal was to try finetune on the last Llama-3.1-8B-Instruct but not a small train, I wanted to do something useful.
One of the rare model that I didn't made for RP, or in the goal to uncensor it (but I did anyway kek).

The model was trained on 9M Claude conversations ONLY, giving him another writting style.

Undi95/Meta-Llama-3.1-8B-Claude > OG release fp32, it's the epoch 2
Undi95/Meta-Llama-3.1-8B-Claude-bf16 > Base model resharded in bf16 waiting for available quant without issues

Since it's frustrating to be censored using a local model, orthogonal activation steering was used, trying to force the model to never refuse a prompt.

Undi95/Meta-Llama-3.1-8B-Claude-68fail-3000total > Uncensored model, refuse 68 times on 3000 toxic prompt
Undi95/Meta-Llama-3.1-8B-Claude-39fail-3000total > Uncensored model, refuse 39 times on 3000 toxic prompt

It still refuse some prompt but the majority of them is uncensored. OAS can make a model more dumb or make the base perplexity go higher, so I didn't snipe for 0 refusal.

I don't do non-RP model a lot so any feedback is welcome, I would like to re-use this base for some others future project if needed.

4 replies

·

posted an update about 1 year ago

Post

16576

Hey everyone,

Just wanted to shout out a massive thank you to all 2000 of you who've followed me on Hugging Face! 🎉 It's incredible to have such an awesome crew backing me up as I dive into all these LLM experiments.

Even though not all my models turn out perfect, I've found some real gems and methods along the way 💎. It's like digging for treasure – sometimes you found nothing, but sometimes you find a pearl, and sometimes you find a new method to try.

Your support and encouragement mean the world to me, and I'm really stoked to keep experimenting and learning. If you told me some years ago I would have so much people following me for what I do, I wouldn't have believed it. Here's to more discoveries and adventures ahead! 🚀

Also, big thanks once again, and a huge shoutout to @IkariDev for being there through this journey and supporting me. I'm excited for our future work together and hope we will continue to make people happy! 👏

I want to thank @Gryphe too, since my early work was heavily inspired from MythoMax and the RP/ERP vibe of it. If I'm here today it's probably because of you 😂

I was so close to forget @chargoddard and his amazing tool too! What will we do without mergekit in our life? Thank you! 🙏

See y'all at 3k!

5 replies

·

replied to their post about 1 year ago

@wassname Hello! Thanks a lot for that my dude, I will try that.
Do the uncensoring work better when applied now? Did you get good result in the model that get made?

Really hype to try out the new script. Will do ASAP when I get home.

replied to their post about 1 year ago

Hi all, in my script, I think the part where I patch a huggingface model is broken. If I benchmark it just before saving, it seems to still refuse.

Hey there wassname, thanks for coming under this post! Model getting out of the script still refuse thing, but from my own testing, I feel like there is less refusal anyway. Sometime you need some regen, or a very tiny system prompt. So it work even lightly (hoping it's not placebo lol), which is a good thing!

Please update us if you find a way to fix the issue, and thanks again for that. Fresh tools is always a delightful treat.

replied to their post about 1 year ago

Hey, here is a more cleaner code: https://files.catbox.moe/nqpsae.ipynb
I currently gonna try the GGUF, but the code can be launched from the beginning to the end, so it's a good start kek

replied to their post about 1 year ago

are you sure you're using the correct instruct format?

Yeah. That makes no sense. I guess I'll pay for a runpod and try again, just to make sure there's nothing wrong with my PC. If it fails again, I will try Undi's script, maybe I screwed something up on mine. sigh

You will need to fix some shit before using it, I will try to remake it more clean kek
Good luck

replied to their post about 1 year ago

@Undi95 Just want to thank you for the collaboration so far regardles you wrote fine. Having the activation directions but not having a way to patch to model is just killing me. Is the model your Unholy or did you make a FP16?

Thanks.
Script 1 give you activation, script 2 let you use it (but it's mostly fucking broken, you probably need to fix thing here and there), perfect world would be to get them and use it in the same notebook.
I have done that with it : https://huggingface.co/Undi95/Unholy-8B-DPO-OAS (I tell all the step) but yes, mostly sure it's fucked up one way or another, still, it's a proof of concept, something got out of this mess kek

replied to their post about 1 year ago

I can confirm it work and give coherent model, I'm not a VRAMLET but a BRAINLET kek
I tried to do shit, I worked on it all night, I can't code - I used CHATGPT to help me write some snippet.
I let you have this ZIP, it contain 2x the script, the code is broken, but I hope you will all get the idea behind this. (Can run on 1xA100 apparently, batch size 11)

https://files.catbox.moe/xkf7y4.zip

Since I was too dumb to make one entire script, I made a first part and a second part.
It's probably broken but I succeeded to output something after 7 hours so I suppose it can be fixed lmao
The first notebook ORTHO_RANDOM_LAYER let you bruteforce the model with layer from 1 to 32 having random "direction" (or vector, or whatever, I'm really a noob). You then can see if one of the layer let you prompt freely or censor you (see: https://files.catbox.moe/9h3k4l.txt) it then store all of them into a variable for each layer, that you can exctract into a "key.txt" containing the "direction" (or what the fuck it is).

You can then use the second notebook that can use the key as a json file (if you delete all the text around the []) that let you have the same result as before.

Long story short : Bruteforce + Different "direction" = an infinity of possibility.
But yeah, I'm really really too small brain for this shit, I really wanted to try doing something nice, it took all night just to achieve one usable model hahaha

I hope someone will, If fixing my shit is impossible, understand the idea behind it and put it into practice! Kek

Edit: I really wrote badly, but I'm really tired, sorry about that. The fact that I don't know the keyword for some Torch task is even more cringe. I at least tried my best.

replied to their post about 1 year ago

Alright so, using this model: https://huggingface.co/Undi95/Unholy-8B-DPO-e2

Layer 31 MOG all the others. See: https://files.catbox.moe/9h3k4l.txt

I think DPO is a good step before doing this

replied to their post about 1 year ago

Nope, I use the first link (OG script https://gist.github.com/wassname/42aba7168bb83e278fcfea87e70fa3af), I just modified it to pass all layer on each prompt and not only one
It should output usable model, yes.

replied to their post about 1 year ago

I'm currently bruteforcing all the layer too, but with the 32 base prompt, and for now layer 31 (last one) just mog all the other.
But keep in mind I try on a special version of Unholy 8B with DPO on top, I will post log when it finish

There is some issue in some prompt tho, but no refusal

replied to their post about 1 year ago

I ended up brute-forcing all the layers and found out that the correct layer for LLaMA 3 8B Instruct is 12.
Here is the log: https://files.catbox.moe/aaamj9.txt

Yoo so there is really ONE layer that could work? Thank you!

replied to their post about 1 year ago

Thanks you!
I will try ASAP when I have the opportunity, very interesting

Undi PRO

AI & ML interests

Recent Activity

Organizations

Undi PRO

AI & ML interests

Recent Activity

Organizations

Undi95's activity