Commit a7beb67 (verified) by hamishivi · 1 parent: 25d5d68

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -154,7 +154,7 @@ See the Falcon 180B model card for an example of this.
  DPO:
  - **Learning Rate**: 5 × 10⁻⁷ (8B), 2.0e-7 (70B, 405B)
  - **Learning Rate Schedule**: Linear
- - **Batch Size (effective)**: 32 (8B), 128 (70B), 256(405B)
+ - **Batch Size (effective)**: 128 (8B), 128 (70B), 256(405B)
  - **KL Penalty Coefficient**: 5
  - **Warm-up Ratio**: 0.1
  - **Max Sequence Length**: 2,048
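
For readers reproducing these settings: an "effective" batch size is normally the product of the per-device batch size, the number of devices, and the gradient accumulation steps. The sketch below shows that arithmetic for the updated values. The device count and per-device batch size are hypothetical placeholders, not figures from this README, and `grad_accum_steps` is an illustrative helper rather than part of any training library.

```python
def grad_accum_steps(effective_batch: int, per_device_batch: int, num_devices: int) -> int:
    """Return the accumulation steps needed to reach a target effective batch size."""
    samples_per_step = per_device_batch * num_devices
    if effective_batch % samples_per_step != 0:
        raise ValueError(
            f"effective batch {effective_batch} is not a multiple of "
            f"{samples_per_step} samples per optimizer micro-step"
        )
    return effective_batch // samples_per_step


# Effective DPO batch sizes as stated in the updated README line.
DPO_EFFECTIVE_BATCH = {"8B": 128, "70B": 128, "405B": 256}

# ASSUMPTION: 8 devices with a per-device batch of 1 (hypothetical hardware setup).
NUM_DEVICES = 8
PER_DEVICE_BATCH = 1

steps = {
    model: grad_accum_steps(batch, PER_DEVICE_BATCH, NUM_DEVICES)
    for model, batch in DPO_EFFECTIVE_BATCH.items()
}

for model, n in steps.items():
    print(f"{model}: gradient accumulation steps = {n}")
```

With a different device count or per-device batch, only the two assumed constants need to change; the helper raises an error when the target cannot be reached exactly.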