Update README.md
README.md
CHANGED
@@ -17,9 +17,10 @@ Mishima Imatrix Quants of new "Qwen 3 - 4B" model with MAX "output tensor" at BF
 
 Mishima Imatrix dataset was generated using some of the public domain works of author (in English):
 
-YUKIO MISHiMA (Japanese author and poet, 1925-1970)
+YUKIO MISHiMA (Japanese author and poet, 1925-1970) ; entire work of "SUN AND STEEL" (HIS PERSONAL TESTAMENT ON ART, ACTION, AND RITUAL DEATH)
 
-This is an experiment to determine prose changes / changes to the model using a specific, but long and detailed Imatrix dataset
+This is an experiment to determine prose changes / changes to the model using a specific, but long and detailed Imatrix dataset
+on the newest Qwen 3 model type.
 
 To test against "Qwen 3 4B" regular, "Horror" and "NEO" versions:
 
@@ -57,6 +58,8 @@ Each one is the same dataset, however dataset #1 is "raw" text format, whereas d
 
 Both affect the model differently.
 
+Also, each quant has the output tensor at BF16 (16 bit precision) to improve reasoning and output generation.
+
 Context Length: 32 K + 8K output generation. (can be extended to 128k)
 
 <B>NOTE - Jinja Template / Template to Use with this Model:</B>