Text Generation
Transformers
Safetensors
qwen3_moe
creative
creative writing
fiction writing
plot generation
sub-plot generation
story generation
scene continue
storytelling
fiction story
science fiction
romance
all genres
story
writing
vivid prose
vivid writing
Mixture of Experts
mixture of experts
64 experts
8 active experts
fiction
roleplaying
bfloat16
rp
qwen3
horror
finetune
thinking
reasoning
Merge
uncensored
abliterated
Not-For-All-Audiences
conversational
Update README.md
README.md
CHANGED
@@ -61,8 +61,8 @@ The source code can also be used directly.
 ABOUT:
 
 A stranger, yet radically different version of Kalmaze's "Qwen/Qwen3-16B-A3B" (that was abliterated by "huihui-ai") with the
-experts pruned to 64 (from 128, the Qwen 3 30B-A3B version) and then I added 19 layers expanding (Brainstorm 20x by DavidAU
-the model to 23B total parameters.
+experts pruned to 64 (from 128, the Qwen 3 30B-A3B version), and then I added 19 layers (Brainstorm 20x by DavidAU - info at the
+bottom of this page), expanding the model to 23B total parameters.
 
 The goal: slightly alter the model, to address some odd creative thinking and output choices.
 
@@ -921,3 +921,63 @@ It was….
 —and the silence returns.
 
 And the dream continues.
+
+---
+
+<H2>What is Brainstorm?</H2>
+
+---
+
+<B>Brainstorm 20x</B>
+
+The BRAINSTORM process was developed by David_AU.
+
+Some of the core principles behind this process are discussed in this <a href="https://arxiv.org/pdf/2401.02415">
+scientific paper: Progressive LLaMA with Block Expansion</a>.
+
+However, I went in a completely different direction from what was outlined in this paper.
+
+What is "Brainstorm"?
+
+The reasoning center of an LLM is taken apart, reassembled, and expanded.
+
+In this case, for this model: 20 times.
+
+Then these centers are individually calibrated. These "centers" also interact with each other.
+This introduces subtle changes into the reasoning process.
+The calibrations then adjust - dial up or down - these "changes".
+The number of centers (5x, 10x, etc.) allows more "tuning points" to further customize how the model reasons, so to speak.
+
+The core aim of this process is to increase the model's detail, its concept of and connection to the "world",
+its general concept connections, and its prose quality and prose length, without affecting instruction following.
+
+This will also enhance creative use cases of any kind, including "brainstorming", creative art forms, and similar uses.
+
+Here are some of the enhancements this process brings to the model's performance:
+
+- Prose generation seems more focused on the moment-to-moment.
+- Sometimes there will be "preamble" and/or foreshadowing present.
+- Fewer or no "cliches".
+- Better overall prose and/or more complex / nuanced prose.
+- A greater sense of nuance on all levels.
+- Coherence is stronger.
+- Description is more detailed, and connected more closely to the content.
+- Similes and metaphors are stronger and better connected to the prose, story, and characters.
+- The sense of "being there" / in the moment is enhanced.
+- Details are more vivid, and there are more of them.
+- Prose generation length can be long to extreme.
+- Emotional engagement is stronger.
+- The model will take FEWER liberties than a normal model: it will follow directives more closely but "guess" less.
+- The MORE instructions and/or details you provide, the more strongly the model will respond.
+- Depending on the model, the "voice" may be more "human" than the original model's "voice".
+
+Other "lab" observations:
+
+- This process does not, in my opinion, make the model 5x or 10x "smarter" - if only that were true!
+- However, a change in "IQ" was not a priority, and was not tested or calibrated for, so to speak.
+- From lab testing, the model seems to ponder and consider more carefully, roughly speaking.
+- You could say this process sharpens the model's focus on its task(s) at a deeper level.
+
+The process modifies the model at the root level - the source files. The modified model can then be quantized as GGUF, EXL2, AWQ, etc.
+
+---
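
For readers curious how the expert-pruning step described in the ABOUT section might look in practice, here is a minimal sketch. It is not the actual pipeline used for this model: it assumes the `transformers` Qwen3-MoE module layout (`model.model.layers[i].mlp` holding a router `gate` and an `experts` ModuleList), and it keeps an arbitrary first 64 experts, whereas a real prune would rank experts by routing statistics first.

```python
# Hypothetical sketch: prune a Qwen3 MoE checkpoint from 128 to 64 experts.
# NOT the script used for this model; module names follow the transformers
# Qwen3-MoE implementation and may differ between library versions.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-30B-A3B", torch_dtype=torch.bfloat16
)

keep = list(range(64))  # placeholder: a real prune ranks experts by usage first

for layer in model.model.layers:
    moe = layer.mlp  # sparse MoE block: router ("gate") + ModuleList of experts
    moe.experts = torch.nn.ModuleList(moe.experts[i] for i in keep)
    # Slice the router so its logits line up with the surviving experts.
    moe.gate.weight.data = moe.gate.weight.data[keep].clone()
    moe.gate.out_features = len(keep)
    moe.num_experts = len(keep)

model.config.num_experts = len(keep)
model.save_pretrained("qwen3-moe-64experts")  # hypothetical output path
```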
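The Brainstorm section above cites block expansion (arXiv:2401.02415), and is explicit that Brainstorm departs from that paper. So the sketch below only illustrates the paper's underlying idea - duplicating decoder blocks and zero-initialising their output projections so the expanded model starts as a function-preserving copy - not Brainstorm itself. The insertion interval and model name are illustrative.

```python
# Hypothetical block-expansion sketch (in the spirit of arXiv:2401.02415,
# "LLaMA Pro: Progressive LLaMA with Block Expansion"). NOT Brainstorm itself.
import copy
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-30B-A3B", torch_dtype=torch.bfloat16
)

insert_every = 4  # illustrative: add one copied block after every 4 originals
expanded = []
for i, layer in enumerate(model.model.layers):
    expanded.append(layer)
    if (i + 1) % insert_every == 0:
        dup = copy.deepcopy(layer)
        # Zero the block's output projections so, at initialisation, the new
        # block contributes nothing and the expanded model matches the original.
        dup.self_attn.o_proj.weight.data.zero_()
        for expert in dup.mlp.experts:  # MoE block: zero each expert's output
            expert.down_proj.weight.data.zero_()
        expanded.append(dup)

model.model.layers = torch.nn.ModuleList(expanded)
model.config.num_hidden_layers = len(expanded)
# A real script would also renumber each layer's layer_idx for KV-cache use.
```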
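The closing note says the modified source files can be quantized to GGUF, EXL2, AWQ, etc. As one concrete illustration, here is the standard AutoAWQ recipe; whether a given MoE architecture is supported depends on your AutoAWQ and transformers versions, and both paths are placeholders.

```python
# Hypothetical AWQ quantization of the merged source files using AutoAWQ.
# Standard AutoAWQ recipe; MoE support depends on library versions.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

src = "path/to/brainstormed-source-files"  # placeholder path
dst = "path/to/quantized-awq"              # placeholder path

quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(src)
tokenizer = AutoTokenizer.from_pretrained(src)

model.quantize(tokenizer, quant_config=quant_config)  # runs calibration internally
model.save_quantized(dst)
tokenizer.save_pretrained(dst)
```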