DavidAU committed · Commit 6d7f874 · verified · 1 Parent(s): 94cce88

Update README.md

Files changed (1)
  1. README.md +58 -1
README.md CHANGED
@@ -52,7 +52,7 @@ The source code can also be used directly.
 
 ABOUT:
 
- Qwen's excellent "Qwen3-30B-A3B" with Brainstorm 20x in a MOE at 42B parameters.
+ Qwen's excellent "Qwen3-30B-A3B" with Brainstorm 20x (tech notes at bottom of the page) in a MOE at 42B parameters.
 
 This pushes Qwen's model to the absolute limit for creative use cases.
 
@@ -321,3 +321,60 @@ EXAMPLE #4 - temp 1.2
 
 OUTPUT:
 
+ ---
+
+ <H2>What is Brainstorm?</H2>
+
+ <B>Brainstorm 20x</B>
+
+ The BRAINSTORM process was developed by David_AU.
+
+ Some of the core principles behind this process are discussed in this <a href="https://arxiv.org/pdf/2401.02415">
+ scientific paper: Progressive LLaMA with Block Expansion</a>.
+
+ However, I went in a completely different direction from what was outlined in this paper.
+
+ What is "Brainstorm"?
+
+ The reasoning center of an LLM is taken apart, reassembled, and expanded.
+
+ In this case, for this model: 20 times.
+
+ Then these centers are individually calibrated. These "centers" also interact with each other.
+ This introduces subtle changes into the reasoning process.
+ The calibrations then dial these "changes" up or down.
+ The number of centers (5x, 10x, etc.) allows more "tuning points" to further customize how the model reasons, so to speak.
+
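+ As a rough illustration only, the block-expansion idea from the paper linked above can be sketched in a few lines of
+ Python with Hugging Face transformers. This is NOT the Brainstorm recipe itself (which, as noted, diverges from that
+ paper); the model name, the number of duplicated blocks, and where they are inserted are placeholder assumptions.
+
+ ```python
+ # Minimal sketch of block/layer expansion in the spirit of the linked paper.
+ # NOT the Brainstorm method itself; the model name and duplication rule are illustrative only.
+ import copy
+ import torch
+ from transformers import AutoModelForCausalLM
+
+ model = AutoModelForCausalLM.from_pretrained(
+     "Qwen/Qwen3-30B-A3B",            # assumption: base model to expand
+     torch_dtype=torch.bfloat16,
+     device_map="cpu",
+ )
+
+ layers = model.model.layers           # the stack of decoder blocks
+ expanded = torch.nn.ModuleList()
+
+ for i, layer in enumerate(layers):
+     expanded.append(layer)
+     # Illustrative rule: duplicate the last few blocks; the copies would then be
+     # individually calibrated (tuned / scaled) in a separate step.
+     # In a real pipeline, per-layer attributes such as layer_idx would also need updating.
+     if i >= len(layers) - 4:
+         expanded.append(copy.deepcopy(layer))
+
+ model.model.layers = expanded
+ model.config.num_hidden_layers = len(expanded)
+ model.save_pretrained("qwen3-expanded-sketch")
+ ```
+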
+ The core aim of this process is to increase the model's detail, its concept of and connection to the "world", its general
+ concept connections, and its prose quality and length, without affecting instruction following.
+
+ This will also enhance any creative use case(s) of any kind, including "brainstorming", creative art forms, and similar use cases.
+
+ Here are some of the enhancements this process brings to the model's performance:
+
+ - Prose generation seems more focused on the moment-to-moment.
+ - Sometimes there will be "preamble" and/or foreshadowing present.
+ - Fewer or no "clichés".
+ - Better overall prose and/or more complex / nuanced prose.
+ - A greater sense of nuance on all levels.
+ - Coherence is stronger.
+ - Description is more detailed, and more closely connected to the content.
+ - Similes and metaphors are stronger and better connected to the prose, story, and character.
+ - The sense of "being there" / in the moment is enhanced.
+ - Details are more vivid, and there are more of them.
+ - Prose generation length can be long to extreme.
+ - Emotional engagement is stronger.
+ - The model will take FEWER liberties than a normal model: it will follow directives more closely but will "guess" less.
+ - The MORE instructions and/or details you provide, the more strongly the model will respond.
+ - Depending on the model, the "voice" may be more "human" than the original model's "voice".
+
+ Other "lab" observations:
+
+ - This process does not, in my opinion, make the model 5x or 10x "smarter"; if only that were true!
+ - However, a change in "IQ" was not a priority, and was not tested or calibrated for, so to speak.
+ - From lab testing, the model seems to ponder and consider more carefully, roughly speaking.
+ - You could say this process sharpens the model's focus on its task(s) at a deeper level.
+
+ The process of modifying the model occurs at the root level, in the source files. The model can then be quantized as GGUF, EXL2, AWQ, etc.
+
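+ As a rough sketch of the GGUF route only (assuming a local llama.cpp checkout; all paths, file names, and the quant
+ type below are placeholders), the conversion typically looks something like this:
+
+ ```python
+ # Rough sketch of the usual GGUF route with llama.cpp's tools (paths are placeholders).
+ import subprocess
+
+ # 1. Convert the source (HF-format) files to a full-precision GGUF.
+ subprocess.run(
+     ["python", "llama.cpp/convert_hf_to_gguf.py", "path/to/source-model",
+      "--outfile", "model-f16.gguf", "--outtype", "f16"],
+     check=True,
+ )
+
+ # 2. Quantize the GGUF to a smaller format (Q4_K_M shown purely as an example).
+ subprocess.run(
+     ["llama.cpp/build/bin/llama-quantize", "model-f16.gguf", "model-Q4_K_M.gguf", "Q4_K_M"],
+     check=True,
+ )
+ ```
+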
+ ---