This has finally replaced the base model for me for RP and story-writing purposes!

#2
by Cypherfox - opened

This is a great model, retaining the 'smarts' of the base model, and adding even better, more vivid, and creative writing on top. I'm sure it's lost some other pieces, but for what I used Mistral Large 2411 for, this model is just better.

The previous Behemoths didn't...quite measure up. They slid into slop and cliches really easily unless you rather forcefully guided them away. In that, they were worse than the base model, which is why I just standardized on the base ML2411 for...well, a very long time.

I'm running this at Q5 and it's a high quality model. Thanks for returning to the classics, and making them so much better! I hope someday to be able to run this at Q8! 🤣

Do you think I made it smarter? Just curious

Yes, each new version you release brings me a completely fresh experience. My dedication to the latest version leaves me no choice but to delete the older ones.

I'm not sure I'd say 'smarter'; I'm fairly confident that it's less capable at things like tool use, but for creative writing, it's MUCH better. It tracks characters and storylines much better, stays consistent, and is very steerable. I haven't encountered a refusal yet, although I haven't been trying.

So it depends on what you're measuring. I'd say it's much more 'human', and it finds the most fascinating things to say sometimes, making unexpected connections that the base model wouldn't have made. So it has a higher EQ, at the very least!

Cool! I'm currently tuning the 2407 version (calling it Behemoth ReduX) just to see if creativity improves. 123B is generally large enough for coherent storytelling and I wanted to tap into 2407's creativity, since 2411 ruined it with slop and narrow writing.

I have not had much 'slop' from this Behemoth. Earlier today it created an amazingly vivid scene with surprisingly creative language and sentence length from just a few implicit descriptions (and the context up to that point) and tracked the state of the people in the scene perfectly. I wasn't always able to trust ML2411 to write well, but I'm getting used to this one doing that.

For reference, I use a temperature of 1.15, the DRY sampler (multiplier 0.8, base 1.75, allowed length 2 tokens), and 5-bit quants.
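
For anyone wondering what those DRY numbers do, here is a rough sketch. It assumes the three values are the multiplier, base, and allowed length, in that order, and that the penalty grows exponentially with the length of the repeated sequence, as DRY is usually described. The function name `dry_penalty` is made up for illustration and is not any backend's actual API.

```python
def dry_penalty(match_len: int,
                multiplier: float = 0.8,
                base: float = 1.75,
                allowed_length: int = 2) -> float:
    """Approximate logit penalty for a token that would extend a
    repeated sequence already match_len tokens long.

    Assumption: penalty = multiplier * base ** (match_len - allowed_length),
    with no penalty for repeats shorter than allowed_length.
    """
    if match_len < allowed_length:
        return 0.0
    return multiplier * base ** (match_len - allowed_length)


# Show how quickly the penalty grows as a repeat gets longer.
for n in range(1, 6):
    print(n, round(dry_penalty(n), 4))
```

With these settings, short repeats cost almost nothing, while longer verbatim loops get pushed down hard. That lines up with this setup rarely falling into repeated phrasing.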
