Join our Discord! https://discord.gg/BeaverAI

Nearly 7000 members strong πŸ’ͺ A hub for users and makers alike!


Drummer is open for work / employment (I'm a Software Engineer). Contact me through any of these channels: https://linktr.ee/thelocaldrummer

Thank you to everyone who subscribed through Patreon. Your support helps me chug along in this brave new world.


Drummer proudly presents...

Behemoth R1 123B v2 🦣


Usage

  • Mistral v7 (Non-Tekken) template, i.e., Mistral v3 + [SYSTEM_PROMPT]
  • Warning: Using the wrong template version or whitespace may degrade performance.
  • Prefill <think> to ensure reasoning (and test your patience). See the prompt-building sketch after this list.
  • You can slightly steer the thinking by prefixing the think tag (e.g., <immoral_think>).
  • Works great even without reasoning.
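If you're wiring the prefill up yourself, here's a minimal sketch. It assumes the tokenizer bundled with this repo carries the correct Mistral v7 (Non-Tekken) chat template and uses placeholder message contents; verify the rendered prompt and whitespace against your backend before relying on it.

```python
# Minimal sketch (not from the model card): build a prompt that prefills <think>.
# Assumes the repo's tokenizer ships the Mistral v7 (Non-Tekken) chat template;
# double-check the rendered whitespace, since the wrong template hurts performance.
from transformers import AutoTokenizer

MODEL_ID = "TheDrummer/Behemoth-R1-123B-v2"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

messages = [
    {"role": "system", "content": "You are the narrator of an interactive story."},  # placeholder
    {"role": "user", "content": "Continue the scene from the tavern doorway."},      # placeholder
]

# Render the conversation, then append the prefill so generation starts inside the think block.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
prompt += "<think>"  # swap in a steering tag such as <immoral_think> to nudge the planning phase

print(prompt)  # feed this string to your text-completion backend of choice
```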

Rationale for Reasoning

Hear me out for a second. I know it's crazy to have a 123B dense model spend precious output tokens reasoning for a while, but if you're a fan of Largestral, then consider the following...

Sometimes you want to leave the character's responses themselves untouched. Reasoning divides the AI response into two phases: planning & execution. It gives you the opportunity to 'modify' the planning phase without messing with the character's execution.

The planning phase will also pick apart the scenario, break down nuances, and surface implicit story elements. If it's erroneous, then you have a chance to correct the AI before the execution phase. If it's missing details, then you can wrangle it during the planning phase and watch it unfold in the execution phase.

In a nutshell: reasoning adds another useful dimension to these creative uses.
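If you want to see what that planning-then-execution workflow looks like in practice, here's a rough two-pass sketch against an OpenAI-compatible completions endpoint. The server URL, payload shape, and the example edit are all my assumptions for illustration; nothing here is prescribed by the model itself.

```python
# Hypothetical two-pass workflow: generate the planning phase, edit it, then resume
# generation for the execution phase. Assumes an OpenAI-compatible /v1/completions
# server is hosting the model locally; endpoint, port, and the edit are assumptions.
import requests

COMPLETIONS_URL = "http://localhost:8080/v1/completions"          # assumed local server
PROMPT = "<rendered prompt from the Usage sketch>" + "<think>"     # placeholder prefix

def complete(prompt: str, stop=None, max_tokens: int = 512) -> str:
    resp = requests.post(COMPLETIONS_URL, json={
        "model": "TheDrummer/Behemoth-R1-123B-v2",  # some servers ignore this field
        "prompt": prompt,
        "max_tokens": max_tokens,
        "stop": stop or [],
    })
    resp.raise_for_status()
    return resp.json()["choices"][0]["text"]

# Pass 1: planning only; stop before the model closes the think block.
plan = complete(PROMPT, stop=["</think>"])

# Inspect and correct the plan however you like (fix errors, add missing details).
edited_plan = plan.replace("hesitates at the door", "kicks the door open")  # illustrative edit

# Pass 2: execution, with the corrected plan locked in as a prefill.
response = complete(PROMPT + edited_plan + "</think>\n", max_tokens=1024)
print(response)
```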

Description

As far as I can tell, this doesn't even feel like Behemoth. It's something way better. It's in the top 3 you've ever made. This is a solid cook, my man.

Characters in particular are portrayed so much better and more authentically, which was Largestral's biggest problem. Dialogue is much improved, and the smarts 2411 had have been retained quite well. Its prose has changed for the better, without the base model's overconfidence.

This is so much better than any other 2411 tune I've tried tbh. It's doing quite well on adherence.

After a few messages, the model gets pretty smart. In fact, so smart that it tries to analyze why I want to do some particular RP. The model gets even better with a nasty prefill.

This model continues to surprise and impress me. It's exactly what I wanted Largestral 2411 to be. I cannot overstate how much better it is than the base and any other tune of it. From what I remember, it actually feels as good as Nemotron Ultra.

Yes, super intelligent, and something about it makes characters have much more texture and personality than other models.

Links


config-v2d
