Tarek07 committed
Commit 0383ccc · verified · 1 parent: 2364d82

Update README.md

Files changed (1): README.md (+3 −1)
README.md CHANGED
@@ -12,7 +12,9 @@ tags:
  - merge
  license: llama3.3
  ---
- **User y-ryan discovered an issue where the model had an invalid tensor shape for its weights ([1, 8192]), raising errors when loading with transformers, and fixed it here: [tmfi-us/Progenitor-V5-Final-LLaMa-70B](https://huggingface.co/tmfi-us/Progenitor-V5-Final-LLaMa-70B). I have no clue what the reason is, but despite that I was still able to use and even quant this model?! Testing the fixed version and this one gave me different outputs too, with this version's output being really good? If anyone understands this I would love to hear about it.**
+ **Upon further testing I found some logic issues!!**
+
+ //User y-ryan discovered an issue where the model had an invalid tensor shape for its weights ([1, 8192]), raising errors when loading with transformers, and fixed it here: [tmfi-us/Progenitor-V5-Final-LLaMa-70B](https://huggingface.co/tmfi-us/Progenitor-V5-Final-LLaMa-70B). I have no clue what the reason is, but despite that I was still able to use and even quant this model?! Testing the fixed version and this one gave me different outputs too, with this version's output being good? If anyone understands this I would love to hear about it.//
 
  This marks the culmination of my experiments with the Progenitor series. I fixed the typo I had earlier where it wasn't computing in float32, but computing a 6-model merge in float32 is a bit taxing on resources and time, so I saved it for the configuration I thought was the best (it's not something I can afford to do with every model I make, just the worthwhile ones). This one also uses Sicari's tokenizer, which I find the best.
  # merge
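
For anyone wanting to check a downloaded copy for the kind of shape problem described above, here is a minimal sketch (not part of the original card; the local `model-*.safetensors` glob is an assumption) that lists the tensor shapes in the checkpoint's safetensors shards and flags 2-D weights with a leading dimension of 1, such as the reported [1, 8192] entry:

```python
# Sketch: scan local safetensors shards for one-row 2-D weights like [1, 8192].
# Assumes the shards sit in the current directory; adjust the glob as needed.
import glob

from safetensors import safe_open

for shard in sorted(glob.glob("model-*.safetensors")):
    with safe_open(shard, framework="pt", device="cpu") as f:
        for name in f.keys():
            # get_slice() reads only the header metadata, not the full tensor.
            shape = f.get_slice(name).get_shape()
            # A norm weight stored as [1, 8192] instead of [8192] is the sort
            # of mismatch that makes transformers raise a shape error on load.
            if len(shape) == 2 and shape[0] == 1:
                print(f"{shard}: {name} -> {shape}")
```

If nothing is printed, the shards carry no weights of that suspicious shape; any hits point at the tensors that the fixed repo linked above reshapes.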