Create README
README
ADDED
Hyperparameters for GLUE:
- Learning rate: 5e-5
- Batch size: 64
- Max epochs: 10
- Patience: 10 (for CoLA, MRPC, RTE, BoolQ, MultiRC, and WSC), 100 (for MNLI, QQP, QNLI, and SST-2)
- Random seed: 12
- Weight decay: 0.1
- Warmup ratio: 0.1
- Learning rate scheduler: cosine
- Eval strategy: epoch (for CoLA, MRPC, RTE, BoolQ, MultiRC, and WSC), steps (for MNLI, QQP, QNLI, and SST-2)
- Eval every: 1 epoch (for CoLA, MRPC, RTE, BoolQ, MultiRC, and WSC), 200 steps (for SST-2 and QNLI), 500 steps (for MNLI and QQP)
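
The per-task GLUE settings above can be captured in a small lookup, which may help when scripting runs over all tasks. This is only a sketch: the names `FinetuneConfig`, `glue_config`, `EPOCH_EVAL_TASKS`, and `STEP_EVAL_INTERVALS` are hypothetical and do not come from this repository, and the field names are not tied to any particular training library.

```python
from dataclasses import dataclass

# Tasks evaluated once per epoch with patience 10 (as listed above).
EPOCH_EVAL_TASKS = {"CoLA", "MRPC", "RTE", "BoolQ", "MultiRC", "WSC"}

# Tasks evaluated every N steps with patience 100 (as listed above).
STEP_EVAL_INTERVALS = {"SST-2": 200, "QNLI": 200, "MNLI": 500, "QQP": 500}


@dataclass(frozen=True)
class FinetuneConfig:
    """Hypothetical container; defaults are the values shared by all GLUE tasks."""
    task: str
    learning_rate: float = 5e-5
    batch_size: int = 64
    max_epochs: int = 10
    patience: int = 10
    seed: int = 12
    weight_decay: float = 0.1
    warmup_ratio: float = 0.1
    lr_scheduler: str = "cosine"
    eval_strategy: str = "epoch"
    eval_every: int = 1


def glue_config(task: str) -> FinetuneConfig:
    """Return the settings listed above for one task."""
    if task in EPOCH_EVAL_TASKS:
        return FinetuneConfig(task=task)
    if task in STEP_EVAL_INTERVALS:
        return FinetuneConfig(
            task=task,
            patience=100,
            eval_strategy="steps",
            eval_every=STEP_EVAL_INTERVALS[task],
        )
    raise KeyError(f"No hyperparameters listed for task: {task}")
```

For example, `glue_config("MNLI")` yields step-based evaluation every 500 steps with patience 100, while `glue_config("RTE")` keeps the shared defaults.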

Hyperparameters for MSGS:
- Learning rate: 5e-5 (for CR, SC, RP, MV_RTP, and SC_LC), 1.5e-5 (for LC), 1e-5 (for SC_RP), 8e-6 (for MV_LC), 5e-6 (for MV), 5e-7 (for CR_LC)
- Batch size: 32
- Max epochs: 10 (for CR, SC, RP, MV_RTP, SC_LC, SC_RP, MV, and CR_LC), 3 (for LC), 5 (for MV_LC)
- Patience: 10 (for CR, SC, RP, MV_RTP, SC_LC, SC_RP, MV, and CR_LC), 3 (for LC), 5 (for MV_LC)
- Random seed: 12
- Weight decay: 0.1
- Warmup ratio: 0.1
- Learning rate scheduler: cosine
- Eval strategy: epoch
- Eval every: 1 epoch
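
Because most MSGS tasks share the same values and only a few deviate, the list above can be read as a set of defaults plus per-task overrides. The sketch below illustrates that reading; `MSGS_DEFAULTS`, `MSGS_OVERRIDES`, `MSGS_TASKS`, and `msgs_config` are hypothetical names, not part of this repository.

```python
# Values shared by all MSGS tasks unless overridden (taken from the list above).
MSGS_DEFAULTS = {
    "learning_rate": 5e-5,
    "batch_size": 32,
    "max_epochs": 10,
    "patience": 10,
    "seed": 12,
    "weight_decay": 0.1,
    "warmup_ratio": 0.1,
    "lr_scheduler": "cosine",
    "eval_strategy": "epoch",
    "eval_every": 1,
}

# Only the tasks that deviate from the defaults appear here.
MSGS_OVERRIDES = {
    "LC": {"learning_rate": 1.5e-5, "max_epochs": 3, "patience": 3},
    "SC_RP": {"learning_rate": 1e-5},
    "MV_LC": {"learning_rate": 8e-6, "max_epochs": 5, "patience": 5},
    "MV": {"learning_rate": 5e-6},
    "CR_LC": {"learning_rate": 5e-7},
}

# Every task named in the list above.
MSGS_TASKS = (
    "CR", "SC", "RP", "MV_RTP", "SC_LC",
    "LC", "SC_RP", "MV_LC", "MV", "CR_LC",
)


def msgs_config(task: str) -> dict:
    """Merge the shared defaults with any overrides for the given task."""
    if task not in MSGS_TASKS:
        raise KeyError(f"No hyperparameters listed for task: {task}")
    return {**MSGS_DEFAULTS, **MSGS_OVERRIDES.get(task, {})}
```

For example, `msgs_config("MV_LC")` returns a learning rate of 8e-6 with 5 max epochs and patience 5, while `msgs_config("CR")` returns the defaults unchanged.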