Update README.md
README.md CHANGED

````diff
@@ -14,11 +14,12 @@ license: llama3.3
 
 # 70B-L3.3-Cirrus-x1
 
-
+\- Same data composition as Freya, applied differently, trained longer too.
+<br>\- Merging with its checkpoints was also involved.
+<br>\- Has a nice style, with occasional issues that can be easily fixed.
+<br>\- A more stable version compared to previous runs.
 
-
-
-Recommended Model Settings | *Look, I just use these, they work fine enough. I don't even know how DRY or other meme samplers work. Your system prompt matters more anyway.*
+My Model Settings | Feel free to use DRY or XTC or whatever meme samplers. I have zero experience with them, I can't help you there.
 ```
 Prompt Format: Llama-3-Instruct
 Temperature: 1.1
@@ -27,7 +28,7 @@ min_p: 0.05
 
 ```
 Training time in total was ~22 Hours on an 8xH100 Node.
-Then, ~3 Hours spent merging checkpoints and model experimentation on a 2xH200 Node.
+Then, ~3 Hours spent merging multiple epoch checkpoints through dare_ties and model experimentation on a 2xH200 Node.
 ```
 
 https://sao10k.carrd.co/ for contact.
````
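
For reference, a minimal sketch of running the model with the settings from the diff, using Hugging Face transformers. It assumes a recent transformers version with `min_p` sampling support, and the repo id is inferred from the model name and the author's handle, not stated in the commit:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Sao10K/70B-L3.3-Cirrus-x1"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Llama-3-Instruct prompt format, produced by the bundled chat template
# (<|start_header_id|>role<|end_header_id|> ... <|eot_id|> blocks).
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Describe a quiet morning above the clouds."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(
    input_ids,
    do_sample=True,
    temperature=1.1,   # from the settings block above
    min_p=0.05,        # from the settings block above
    max_new_tokens=256,
)
print(tokenizer.decode(out[0, input_ids.shape[-1]:], skip_special_tokens=True))
```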
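
The dare_ties checkpoint merge mentioned in the diff is usually done with mergekit; the commit gives no config, so purely as illustration, here's a toy sketch of what the method does to a set of epoch checkpoints. The drop rate, weights, and the simplified sign election are assumptions, not details from the commit:

```python
import torch

def dare_ties_merge(base, checkpoints, drop_p=0.9, weights=None):
    """Toy DARE-TIES merge of epoch-checkpoint state dicts into a base model."""
    weights = weights or [1.0 / len(checkpoints)] * len(checkpoints)
    merged = {}
    for name, base_w in base.items():
        deltas = []
        for ckpt, w in zip(checkpoints, weights):
            delta = ckpt[name] - base_w                       # task vector
            keep = (torch.rand_like(delta) > drop_p).float()  # DARE: random drop
            deltas.append(w * keep * delta / (1.0 - drop_p))  # rescale survivors
        stacked = torch.stack(deltas)
        # TIES-style sign election: keep only deltas that agree
        # with the majority sign before summing them back in.
        elected = torch.sign(stacked.sum(dim=0))
        agree = (torch.sign(stacked) == elected).float()
        merged[name] = base_w + (stacked * agree).sum(dim=0)
    return merged
```

The drop rate and per-checkpoint weights are the main knobs; the point of dare_ties is that naively summing raw deltas from several checkpoints tends to cause interference between them, which dropping and sign election mitigate.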