Update README.md
README.md
CHANGED
@@ -8,7 +8,7 @@ tags: []
 - Base model: Qwen/Qwen2.5-VL-3B-Instruct
 - Training: GRPO with leonardPKU/GEOQA_8K_R1V
 - Training log on wandb: https://wandb.ai/ddderek-hk-polyu/easy_r1/runs/d1xtspm0
--
+- Total step of 70, not converged yet
 
 
 