Sao10K committed (verified)
Commit dc16042 · Parent(s): 586c735

Update README.md

Files changed (1)
  1. README.md +5 -1
README.md CHANGED
@@ -13,11 +13,15 @@ Fimbulvetr-v2 but extended to 16K with PoSE. A sane context value would be ~12K
  Note:
  <br> \- I left Rope Theta at 10K for this train, instead of expanding it like with Stheno 3.3. Solar did not play well with extended theta; grad norm / loss values went parabolic or plunged from 10000+ down. Pretty much unreliable, unlike Stheno 3.3's training run.

+ ---
+
  Notes:
- <br> \- I noticed peoplle having bad issues with quants. Be it GGUF or others, at 8 bit or less. Kind of a weird issue? I had little to no issues during testing at the full precision
+ <br> \- I noticed people having bad issues with quants. Be it GGUF or others, at 8 bit or less. Kind of a weird issue? I had little to no issues during testing at full precision.
  <br> \- Slightly different results from base Fimbulvetr-v2, but during my tests they are similar enough. The vibes are still there.
  <br> \- Formatting issues happen rarely. Sometimes. A reroll / regenerate fixes it from tests.
  <br> \- I get consistent and reliable answers at ~11K context.
  <br> \- Still coherent at up to 16K though! It just doesn't work as well there.

+ I recommend sticking to 12K context at most, but loading the model at 16K. It has really accurate context up to 10K, from extended long-context testing. 16K works fine for roleplays, but not for more detailed tasks.
+
  ![Needle](https://huggingface.co/Sao10K/Fimbulvetr-11B-v2.1-16K/resolve/main/output.png)
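
Not part of the commit, but a minimal sketch of the recommendation in the added lines: load the model with its full 16K window while keeping prompts to roughly 12K tokens. It assumes a standard `transformers` setup; the 12288-token cap, dtype, and generation settings are illustrative choices, not taken from the model card.

```python
# Hypothetical usage sketch: load at the full 16K window, keep prompts ~12K.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Sao10K/Fimbulvetr-11B-v2.1-16K"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick an appropriate dtype for your hardware
    device_map="auto",    # requires `accelerate`
)

prompt = "A long roleplay or story prompt goes here..."

# Cap the prompt at ~12K tokens (assumed value) even though the model is loaded at 16K.
inputs = tokenizer(
    prompt,
    return_tensors="pt",
    truncation=True,
    max_length=12288,
).to(model.device)

out = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```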