Commit b106433
Parent(s): af0b0ff
Update README.md
README.md
CHANGED
@@ -38,7 +38,8 @@ At 2048 tokens context length, the training set was around 2M (2,008,858) sample
 
 ## Training procedure
 
-Trained with LoRA in 4 bit and merged before upload.
+Trained with LoRA targetting `['query_key_value', 'dense', 'dense_h_to_4h', 'dense_4h_to_h']` in 4 bit and merged before upload.
+The adapters are in the `adapters` branch.
 
 ### Training hyperparameters
 
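To illustrate what the added README line means by "targetting" a list of modules, here is a minimal sketch in plain Python. It assumes that a list of target names selects every submodule whose dotted path ends in one of those names, which is roughly how LoRA tooling such as PEFT treats a list of target modules. The module paths in `example_modules` and the helper `is_targeted` are hypothetical and only serve the illustration; they are not taken from this repository.

```python
# Target module names copied from the README line in the diff above.
TARGET_MODULES = ["query_key_value", "dense", "dense_h_to_4h", "dense_4h_to_h"]


def is_targeted(module_name: str, targets=TARGET_MODULES) -> bool:
    """Return True if the last component of the dotted path equals a target name."""
    return any(
        module_name == t or module_name.endswith("." + t) for t in targets
    )


# Hypothetical module paths for one transformer layer (illustrative only).
example_modules = [
    "layers.0.attention.query_key_value",
    "layers.0.attention.dense",
    "layers.0.mlp.dense_h_to_4h",
    "layers.0.mlp.dense_4h_to_h",
    "layers.0.input_layernorm",
    "embed_in",
]

# Modules that would receive LoRA adapters under the assumption stated above.
selected = [name for name in example_modules if is_targeted(name)]
```

Under this assumption, the attention and MLP projection layers are adapted, while layer norms and embeddings are left untouched.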