Text Generation
Transformers
Safetensors
qwen3_moe
creative
creative writing
fiction writing
plot generation
sub-plot generation
story generation
scene continue
storytelling
fiction story
science fiction
romance
all genres
story
writing
vivid prose
vivid writing
Mixture of Experts
mixture of experts
128 experts
8 active experts
fiction
roleplaying
bfloat16
rp
qwen3
horror
finetune
thinking
reasoning
conversational
Update README.md
The source code can also be used directly.

ABOUT:

Qwen's excellent "Qwen3-30B-A3B" with Brainstorm 20x (tech notes at the bottom of the page) in a MOE at 42B parameters.

This pushes Qwen's model to the absolute limit for creative use cases.
---

<H2>What is Brainstorm?</H2>

<B>Brainstorm 20x</B>

The BRAINSTORM process was developed by David_AU.

Some of the core principles behind this process are discussed in this <a href="https://arxiv.org/pdf/2401.02415">scientific paper: Progressive LLaMA with Block Expansion</a>.

However, I went in a completely different direction from what was outlined in that paper.
What is "Brainstorm" ?
|
338 |
+
|
339 |
+
The reasoning center of an LLM is taken apart, reassembled, and expanded.
|
340 |
+
|
341 |
+
In this case for this model: 20 times
|
342 |
+
|
343 |
+
Then these centers are individually calibrated. These "centers" also interact with each other.
|
344 |
+
This introduces subtle changes into the reasoning process.
|
345 |
+
The calibrations further adjust - dial up or down - these "changes" further.
|
346 |
+
The number of centers (5x,10x etc) allow more "tuning points" to further customize how the model reasons so to speak.
|
347 |
+
|
348 |
+
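The general block-expansion idea referenced above can be sketched roughly as follows. This is an illustrative toy only, not the actual Brainstorm code: the function name, the dict-based stand-in "layers", and the per-copy calibration scale are all invented for illustration.

```python
# Toy illustration of block expansion - NOT the actual Brainstorm code.
# Copies of chosen decoder layers are spliced back into the stack, and
# each copy carries its own "calibration" value that can be dialed up
# or down without touching the original layers.
from copy import deepcopy

def expand_blocks(layers, expand_at, n_copies, scale=1.0):
    """Return a new layer stack with n_copies duplicates inserted
    directly after each layer index listed in expand_at."""
    expanded = []
    for i, layer in enumerate(layers):
        expanded.append(layer)
        if i in expand_at:
            for _ in range(n_copies):
                copy = deepcopy(layer)
                copy["calibration"] = scale  # a per-copy "tuning point"
                expanded.append(copy)
    return expanded

# A stand-in "model" of 4 decoder layers.
base = [{"id": i} for i in range(4)]

# Expand the final layer 2x (a 20x build would add far more copies).
model = expand_blocks(base, expand_at={3}, n_copies=2)
print(len(model))  # 6
```

Each duplicated copy keeps its own calibration value, which is the "dial up or down" step described above; the originals are left untouched so instruction following is preserved.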
The core aim of this process is to increase the model's detail, concept and connection to the "world", general concept connections, prose quality, and prose length without affecting instruction following.

This will also enhance any creative use case of any kind, including "brainstorming", creative art forms, and similar use cases.
Here are some of the enhancements this process brings to the model's performance:

- Prose generation seems more focused on the moment to moment.
- Sometimes there will be "preamble" and/or foreshadowing present.
- Fewer or no cliches.
- Better overall prose and/or more complex / nuanced prose.
- A greater sense of nuance on all levels.
- Coherence is stronger.
- Description is more detailed, and connected closer to the content.
- Similes and metaphors are stronger and better connected to the prose, story, and characters.
- The sense of "being there" / in the moment is enhanced.
- Details are more vivid, and there are more of them.
- Prose generation length can be long to extreme.
- Emotional engagement is stronger.
- The model will take FEWER liberties vs a normal model: it will follow directives more closely but will "guess" less.
- The MORE instructions and/or details you provide, the more strongly the model will respond.
- Depending on the model, the "voice" may be more "human" vs the original model's "voice".
Other "lab" observations:
|
372 |
+
|
373 |
+
- This process does not, in my opinion, make the model 5x or 10x "smarter" - if only that was true!
|
374 |
+
- However, a change in "IQ" was not an issue / a priority, and was not tested or calibrated for so to speak.
|
375 |
+
- From lab testing it seems to ponder, and consider more carefully roughly speaking.
|
376 |
+
- You could say this process sharpens the model's focus on it's task(s) at a deeper level.
|
377 |
+
|
378 |
+
The process to modify the model occurs at the root level - the source-files level. The model can then be quantized as GGUF, EXL2, AWQ, etc.

---