Update README.md
README.md CHANGED

````diff
@@ -14,11 +14,12 @@ license: llama3.3
 
 # 70B-L3.3-Cirrus-x1
 
-
+\- Same data composition as Freya, applied differently, trained longer too.
+<br>\- Merging with its checkpoints was also involved.
+<br>\- Has a nice style, with occasional issues that can be easily fixed.
+<br>\- A more stable version compared to previous runs.
 
-
-
-Recommended Model Settings | *Look, I just use these, they work fine enough. I don't even know how DRY or other meme samplers work. Your system prompt matters more anyway.*
+My Model Settings | Feel free to use DRY or XTC or whatever meme samplers. I have zero experience with them, I can't help you there.
 ```
 Prompt Format: Llama-3-Instruct
 Temperature: 1.1
@@ -27,7 +28,7 @@ min_p: 0.05
 
 ```
 Training time in total was ~22 Hours on an 8xH100 Node.
-Then, ~3 Hours spent merging checkpoints and model experimentation on a 2xH200 Node.
+Then, ~3 Hours spent merging multiple epoch checkpoints through dare_ties and model experimentation on a 2xH200 Node.
 ```
 
 https://sao10k.carrd.co/ for contact.
````
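
For reference, a minimal sketch of running the model with the settings from the diff, using Hugging Face transformers. It assumes a recent transformers version with `min_p` sampling support, and the repo id is inferred from the model name and the author's handle, not stated in the commit:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Sao10K/70B-L3.3-Cirrus-x1"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Llama-3-Instruct prompt format, produced by the bundled chat template
# (<|start_header_id|>role<|end_header_id|> ... <|eot_id|> blocks).
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Describe a quiet morning above the clouds."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(
    input_ids,
    do_sample=True,
    temperature=1.1,   # from the settings block above
    min_p=0.05,        # from the settings block above
    max_new_tokens=256,
)
print(tokenizer.decode(out[0, input_ids.shape[-1]:], skip_special_tokens=True))
```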
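
The dare_ties checkpoint merge mentioned in the diff is usually done with mergekit; the commit gives no config, so purely as illustration, here's a toy sketch of what the method does to a set of epoch checkpoints. The drop rate, weights, and the simplified sign election are assumptions, not details from the commit:

```python
import torch

def dare_ties_merge(base, checkpoints, drop_p=0.9, weights=None):
    """Toy DARE-TIES merge of epoch-checkpoint state dicts into a base model."""
    weights = weights or [1.0 / len(checkpoints)] * len(checkpoints)
    merged = {}
    for name, base_w in base.items():
        deltas = []
        for ckpt, w in zip(checkpoints, weights):
            delta = ckpt[name] - base_w                       # task vector
            keep = (torch.rand_like(delta) > drop_p).float()  # DARE: random drop
            deltas.append(w * keep * delta / (1.0 - drop_p))  # rescale survivors
        stacked = torch.stack(deltas)
        # TIES-style sign election: keep only deltas that agree
        # with the majority sign before summing them back in.
        elected = torch.sign(stacked.sum(dim=0))
        agree = (torch.sign(stacked) == elected).float()
        merged[name] = base_w + (stacked * agree).sum(dim=0)
    return merged
```

The drop rate and per-checkpoint weights are the main knobs; the point of dare_ties is that naively summing raw deltas from several checkpoints tends to cause interference between them, which dropping and sign election mitigate.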