Update README.md
README.md CHANGED
@@ -12,7 +12,9 @@ tags:
 - merge
 license: llama3.3
 ---
-**
+**Upon further testing I found some logic issues!!**
+
+//User y-ryan discovered an issue where the model had an invalid tensor.Shape for its weights ([1, 8192]), raising errors when loading with transformers, and fixed it here: [tmfi-us/Progenitor-V5-Final-LLaMa-70B](https://huggingface.co/tmfi-us/Progenitor-V5-Final-LLaMa-70B). I have no clue what the reason is, but despite that I was still able to use and even quantize this model?! Testing the fixed version against this one also gave different outputs, with this version's outputs being the good ones. If anyone understands this, I would love to hear about it.//
 
 This marks the culmination of my experiments with the Progenitor series. I fixed the typo I had earlier where the merge wasn't computing in float32, but merging 6 models in float32 is taxing on resources and time, so I saved it for the configuration I thought was best (it's not something I can afford to do with every model I make, just the worthwhile ones). This one also uses Sicari's tokenizer, which I find the best.
 
 # merge
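For anyone who wants to check a local copy for the bad shape mentioned in the note above, here is a minimal sketch that scans safetensors shards without loading any tensor data. The shard glob and the assumption that the affected tensors are 1-D norm weights stored as [1, 8192] are mine, not confirmed details of y-ryan's fix.

```python
# Minimal sketch: flag weights whose shape looks wrong, e.g. [1, 8192]
# where a 1-D [8192] weight is expected.
# Assumptions: the checkpoint is downloaded locally as *.safetensors
# shards, and the affected tensors are norm weights.
import glob

from safetensors import safe_open

for shard in sorted(glob.glob("*.safetensors")):
    with safe_open(shard, framework="pt") as f:
        for name in f.keys():
            # get_slice() reads only metadata, so no tensor data is loaded
            shape = f.get_slice(name).get_shape()
            if "norm" in name and len(shape) != 1:
                print(f"{shard}: {name} has suspicious shape {shape}")
```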
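On the float32 point: in mergekit, the working precision of a merge is controlled by the `dtype` field of the merge configuration. Below is a minimal sketch of a config computed in float32; the model names, merge method, and tokenizer source are placeholders, not the actual Progenitor recipe.

```python
# Minimal sketch of a mergekit configuration that computes in float32.
# The model names and merge method are placeholders (assumptions),
# not the recipe actually used for this model.
import yaml

config = yaml.safe_load("""
models:
  - model: some-org/model-a          # placeholder
  - model: some-org/model-b          # placeholder
merge_method: model_stock
base_model: meta-llama/Llama-3.3-70B-Instruct
tokenizer_source: base               # stand-in for "Sicari's tokenizer"
dtype: float32                       # compute the merge in full precision
""")
print(config["dtype"])  # float32
```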