Safeword/Abomination 36B/70B 4.1

#784
by sleepdeprived3 - opened

Thanks a lot for continuing to create such awesome models. I'm really looking forward to giving them a try. I'm especially excited for the second attempt at your 70B model. I queued them all. As always, you can follow their status at https://hf.tst.eu/status.html

They will appear on the download page under:

@mradermacher Just look at https://huggingface.co/ReadyArt/Fallen-Safeword-70B-R1-v4.1 to see a beautiful model card. They not only managed to design the readme like their own website but even put an animated webp on there.

We just changed this entry in tokenizer_config.json from
"eos_token": "<|eot_id|>",
to
"eos_token": "<|end▁of▁sentence|>",
That should fix the token leakage.
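
For anyone who wants to apply the same change to a local copy, here is a minimal sketch of the edit in Python. It assumes `eos_token` is stored as a plain string in `tokenizer_config.json` (as in the snippet above). The helper name `set_eos_token` is made up for illustration and is not part of any library.

```python
import json
from pathlib import Path

# Values taken from the change described above.
OLD_EOS = "<|eot_id|>"
NEW_EOS = "<|end▁of▁sentence|>"


def set_eos_token(config_path, new_eos=NEW_EOS):
    """Set "eos_token" in a tokenizer_config.json and return the previous value.

    Hypothetical helper: assumes the file is valid JSON with a top-level object.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))

    previous = config.get("eos_token")
    config["eos_token"] = new_eos

    # ensure_ascii=False keeps the non-ASCII characters of the token readable
    # in the file instead of writing them as \uXXXX escapes.
    path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return previous
```

Calling `set_eos_token("tokenizer_config.json")` in the model directory would rewrite the file in place and return the old value, which should be `OLD_EOS` if the file had not been patched yet. Any quants made before the change would still need to be regenerated.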

Oh no. I did a nukeall and requeued all of them. Luckily they were imatrix-blocked due to blocked/budget thanks to DeepSeek-V2.5-236B, so not much work was lost.

Indeed, the 70B one is near the top of my list to try :)

mradermacher changed discussion status to closed

I'm looking forward to hearing how it goes. I don't personally run anything that my own 4090 can't handle, and I only made the 70B by request for friends, but the llama/deepseek architecture of the Fallen model it's trained on is totally different from what I've been training.
