Mir-2002 committed on
Commit 3b0299d · verified · 1 Parent(s): 9d5aa9c

Update README.md

  1. README.md +2 -0
README.md CHANGED
@@ -96,6 +96,8 @@ WEIGHT_DECAY = 0.01 <br>
  OPTIMIZER = ADAFACTOR <br>
  LR_SCHEDULER = LINEAR <br>
 
+ The model was trained via Colab Pro on an L4 GPU. Gradient accumulation over 4 steps was used to simulate an effective batch size of 64 (16 * 4).
+
  # Loss
 
  On the 35th epoch, the model achieved the following loss:
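
The line added to the README describes gradient accumulation: gradients from 4 micro-batches of 16 samples are combined before each optimizer step, which behaves like a single batch of 64. The sketch below illustrates that arithmetic on a toy one-parameter regression model. It is not the author's training code; the model, the data, and the helper names (`mse_gradient`, `accumulated_gradient`) are assumptions made for illustration, and only the values 16 and 4 come from the README.

```python
import random

MICRO_BATCH_SIZE = 16    # per-step batch size stated in the README
ACCUMULATION_STEPS = 4   # gradient accumulation steps stated in the README
EFFECTIVE_BATCH_SIZE = MICRO_BATCH_SIZE * ACCUMULATION_STEPS  # 16 * 4


def mse_gradient(w, batch):
    """Gradient of the mean squared error for the toy model y_hat = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_gradient(w, data, micro_batch_size, accumulation_steps):
    """Sum scaled micro-batch gradients, as done before one optimizer step."""
    total = 0.0
    for step in range(accumulation_steps):
        start = step * micro_batch_size
        micro_batch = data[start:start + micro_batch_size]
        # Scaling by 1 / accumulation_steps turns the sum into a mean
        # over all samples seen across the micro-batches.
        total += mse_gradient(w, micro_batch) / accumulation_steps
    return total


# Hypothetical toy data: y is roughly 3 * x plus a little noise.
random.seed(0)
data = []
for _ in range(EFFECTIVE_BATCH_SIZE):
    x = random.uniform(-1.0, 1.0)
    data.append((x, 3.0 * x + random.gauss(0.0, 0.1)))

w = 0.5  # arbitrary current parameter value
full_batch_grad = mse_gradient(w, data)
accumulated_grad = accumulated_gradient(
    w, data, MICRO_BATCH_SIZE, ACCUMULATION_STEPS
)

# The two gradients agree up to floating-point rounding.
print(abs(full_batch_grad - accumulated_grad) < 1e-9)
```

The equivalence holds here because every micro-batch has the same size and the loss is a mean over samples. The benefit on a memory-limited GPU is that only 16 samples need to be held in memory at a time.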