Wur doomed!

#14
by jukofyork - opened

Continuation of THE THREAD OF DOOM.

jukofyork pinned discussion

What do you and the others think of the distilled R1 models for writing?

The llama3 / qwen models SFT'd on R1 outputs? I only tried 2 of them.

R1 Qwen (32b) - Lacks knowledge of fiction (same as the official Qwen release), so its writing is no better.

R1 Llama3 - This is generally the worst of them (not just for writing). It'll generate the CoT and then write something completely different.

CoT traces won't let the model do anything out of distribution, so they're not very useful if the base model doesn't have a lot in its training data.

Yeah, I have tried the same two and felt the same way.

I also felt that any attempt to add an R1 distill to the merge recipe of an existing merge project made it worse...so far...

@gghfez @BigHuggyD that has been my experience as well, which is a shame as I had a go of R1 on Openrouter and I was blown away.

What model is anywhere close that is usable on a 24GB VRAM machine with 32GB of RAM, in your experience?

There's nothing like it for now. I'm running R1 slowly on my ThreadRipper:

prompt eval time =   14026.61 ms /   918 tokens (   15.28 ms per token,    65.45 tokens per second)
       eval time =  398806.12 ms /  1807 tokens (  220.70 ms per token,     4.53 tokens per second)
      total time =  412832.73 ms /  2725 tokens
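(As a sanity check on those llama.cpp timings: the tokens-per-second figures are just 1000 divided by the ms-per-token figures.)

```python
# Sanity-check the llama.cpp timing output above.
prompt_ms_per_tok = 14026.61 / 918    # ~15.28 ms per prompt token
eval_ms_per_tok = 398806.12 / 1807    # ~220.70 ms per generated token

print(round(1000 / prompt_ms_per_tok, 2))  # prompt eval throughput, tok/s
print(round(1000 / eval_ms_per_tok, 2))    # generation throughput, tok/s
```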

I tried training the Wizard2 8x22b MoE on R1 data, but it doesn't really work well. It will plan ahead in think tags, e.g.:

I need to ensure the story maintains its gritty, realistic tone without becoming overly melodramatic. The characters' growth should be subtle but significant. Also, the ending should leave a sense of hope but not be too neat—their redemption is fragile, and the future is uncertain.

Let me outline the next few chapters:

Chapter 5: Nightmares and Trust
...

But it doesn't backtrack like R1 does. It just kind of agrees with itself and ends up writing how it usually would:

“I don’t know what I want anymore,” she admitted, voice barely above a whisper as rain tapped against corrugated roofing overhead.

lol

Ahhh, that's a shame :-(

“I don’t know what I want anymore,” she admitted, voice barely above a whisper as rain tapped against corrugated roofing overhead.

Oh god!

I'll have to keep an eye on this thread.

I did enjoy Ppoyaa/MythoNemo-L3.1-70B-v1.0

But my tastes are probably not as refined as others on this thread ;-)

My prompts that ask to identify the boundaries where the book begins and where the end-matter begins still work well with 4o-mini (and 4o as a fallback for the tougher cases).

You try Gemini? I find it always seems to beat OpenAI models at long context, even with how much they've deep-fried Pro. Might be a decent task for Jamba, actually...

I tried a few models back when I wrote the script/prompts and couldn't really get them to work reliably, sadly. It's not easy to use a local model for this either, as it's 15,000 files with the first/last 4k characters (upped to the first/last 16k characters for the second pass).
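Not the actual script, but a minimal sketch of the first/last-N-characters excerpting described above (the function name and omission marker are made up):

```python
def make_excerpt(text: str, n: int = 16_000) -> str:
    """Keep only the first and last n characters of a file's text,
    so the model only sees the front matter and end matter."""
    if len(text) <= 2 * n:
        return text  # short file: send the whole thing
    return text[:n] + "\n...[middle omitted]...\n" + text[-n:]
```

The second pass would just rerun the tougher cases with a larger n (16k instead of 4k).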

@Downtown-Case Kimi-K2 Q2_KS knows it with your prompt

Sir John Babcock – The protagonist, a British aristocrat and occultist who becomes entangled in Illuminati intrigues. His journey involves encounters with secret societies, mystical revelations, and shifting identities.
Albert Einstein – Appears as a character who discusses metaphysics and relativity with Babcock, blurring the lines between fiction and historical reality.
James Joyce – The famous author is depicted as a magus-like figure, weaving linguistic and mystical insights into the plot.
Carl Jung – Engages in psychological and archetypal discussions, particularly regarding synchronicity and the collective unconscious.
The Illuminati – Not a single character but a shadowy, multi-faceted conspiracy that manipulates events. Wilson presents them as both real and imagined, embodying paranoia and enlightenment.
Hagbard Celine – A recurring figure from Wilson’s Illuminatus! trilogy, appearing here as a trickster-mentor who guides Babcock through layers of reality.
Morrigan and the Goddess Eris – Personifications of chaos and discord, central to Wilson’s Discordian themes.
The Dealy Lama – A cryptic guru figure who dispenses paradoxical wisdom, possibly a stand-in for Wilson himself.
Various Historical & Mythic Figures – Aleister Crowley, William Blake, and even Cagliostro appear in dreamlike sequences, reinforcing the novel’s theme of masks and shifting identities.

(That looks like an interesting book lol)

@jukofyork

upped to first/last 16k characters for the second pass

So you need 32k characters + prompt/response? Then yeah, local models don't really make sense; it'd tie up your rig for days unless something tiny like command-r7b in vLLM could do it.


It's a pretty strange book and I only made it around a third of the way through, but I wanted to pick it up again and had completely forgotten the name and author lol.

I think I originally found it by looking at some Reddit threads where people asked for books similar to The Club Dumas by Arturo Pérez-Reverte (if you've ever seen the movie The Ninth Gate, half the book is the story of The Ninth Gate intermixed with another linked story about Alexandre Dumas).


Yeah, it took nearly a day to finish.

I think it was worth it though as I've now got 2 datasets for the +1 class, 2 datasets for the -1 class, and each dataset is duplicated: once for random paragraphs and once for random stories/chapters.

It has been running a day so far (5 more days to go, as I used rank-64 and nearly 1B tokens this time), but it is clearly not finding it as easy to pick up on the formatting patterns:

Screenshot_20250723-005501.png

Just tried Kiwi FP8 for the first time

A soft knock. Three measured taps, almost hesitant. Then the door eases open without waiting for your word, and she steps inside.
Queen Elara.

NOOOOooooo.....


Kimi?

I tried using my "write the opening chapter of a Grimdark trilogy" prompt, and it just never stopped and wanted to write the whole book/trilogy in one go lol.

I haven't really tested it for coding yet. I don't use any of the fancy MCP/agent stuff for code, so it will be interesting to see how it works.

From the little I have tested it, I'm almost certain Kimi was trained on o3 code, as it has the infuriating desire to turn all your comments into this:

// ------------------------------------------------------------------------------------------------------------------------
// blah blah
// ------------------------------------------------------------------------------------------------------------------------

Only o3 does this and it drives me nuts.


Elara Voss?

Sign up or log in to comment