--- library_name: transformers tags: [] --- # Model Card for Model ID - Base model: Qwen/Qwen2.5-VL-3B-Instruct - Training: GRPO with leonardPKU/GEOQA_8K_R1V - Training log on wandb: https://wandb.ai/ddderek-hk-polyu/easy_r1/runs/d1xtspm0 - Total step of 70, not converged yet