Update README.md
README.md
CHANGED
@@ -96,6 +96,8 @@ WEIGHT_DECAY = 0.01 <br>
 OPTIMIZER = ADAFACTOR <br>
 LR_SCHEDULER = LINEAR <br>
 
+The model was trained via Colab Pro on an L4 GPU. Gradient accumulation over 4 steps was used to simulate an effective batch size of 64 (16 * 4).
+
 # Loss
 
 On the 35th epoch, the model achieved the following loss:
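
The gradient accumulation described in the added README line can be illustrated with a small, framework-free sketch. This is not the author's training code: the toy linear model, the synthetic data, and the names `MICRO_BATCH`, `ACCUM_STEPS`, `grad_mse`, and `accumulated_grad` are assumptions made for illustration. Only the figures 16, 4, and 64 come from the diff. The sketch checks that averaging the gradients of 4 micro-batches of 16 gives the same gradient as one batch of 64 for a mean loss.

```python
import random

MICRO_BATCH = 16   # per-step batch size, taken from "(16 * 4)" in the README line
ACCUM_STEPS = 4    # gradient accumulation steps, taken from the README line
EFFECTIVE_BATCH = MICRO_BATCH * ACCUM_STEPS  # 64


def grad_mse(w, batch):
    """Gradient of the mean squared error for the toy model y_hat = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data, micro_batch, accum_steps):
    """Sum per-micro-batch gradients, each scaled by 1 / accum_steps."""
    total = 0.0
    for step in range(accum_steps):
        chunk = data[step * micro_batch:(step + 1) * micro_batch]
        total += grad_mse(w, chunk) / accum_steps
    return total


# Synthetic data (an assumption): y is roughly 3 * x plus small noise.
random.seed(0)
data = [(x, 3.0 * x + random.gauss(0.0, 0.1))
        for x in (random.uniform(-1.0, 1.0) for _ in range(EFFECTIVE_BATCH))]

w = 0.5
full = grad_mse(w, data)                                      # one batch of 64
accum = accumulated_grad(w, data, MICRO_BATCH, ACCUM_STEPS)   # 4 micro-batches of 16
```

Here `full` and `accum` agree up to floating-point rounding, while only 16 examples need to be processed at a time. In a deep learning framework the same idea is typically expressed by dividing each micro-batch loss by the number of accumulation steps and calling the optimizer once every 4 micro-batches.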