URSA-8B-PS-GRPO
URSA-8B-PS-GRPO employs process-supervision GRPO (PS-GRPO), which is proposed in our paper.
Installation
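from huggingface_hub import snapshot_download

repo_id = "URSA-MATH/URSA-8B-PS-GRPO"
local_dir = "YOUR_LOCAL_PATH"  # replace with the directory to download into

# Download the full model repository and return the path to the local copy
snapshot_path = snapshot_download(
    repo_id=repo_id,
    local_dir=local_dir,
    revision="main",
    cache_dir=None,
)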
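After the download finishes, it can help to confirm that the snapshot directory actually contains the model files. The following is a minimal sketch using only the standard library. The helper names `summarize_snapshot` and `missing_files` are hypothetical, and the default expectation of a `config.json` file is an assumption based on how Transformers-style repositories are usually laid out, not something this model card states.

```python
from pathlib import Path


def summarize_snapshot(snapshot_dir):
    """Return sorted (relative_path, size_in_bytes) pairs for files under snapshot_dir.

    Hidden files and directories (names starting with ".") are skipped,
    so download metadata folders do not clutter the listing.
    """
    root = Path(snapshot_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"Snapshot directory not found: {root}")
    entries = []
    for path in root.rglob("*"):
        rel = path.relative_to(root)
        if path.is_file() and not any(part.startswith(".") for part in rel.parts):
            entries.append((rel.as_posix(), path.stat().st_size))
    return sorted(entries)


def missing_files(snapshot_dir, expected=("config.json",)):
    """Return the names in `expected` that are absent from the snapshot.

    The default `expected` tuple is an assumption; adjust it to the files
    you know the repository should contain.
    """
    present = {name for name, _ in summarize_snapshot(snapshot_dir)}
    return [name for name in expected if name not in present]
```

For example, calling `missing_files(snapshot_path)` on the path returned by `snapshot_download` should give an empty list if the assumed file is present, and `summarize_snapshot(snapshot_path)` lists what was downloaded along with file sizes.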
Inference
We have adapted vLLM for URSA-8B. Please refer to the GitHub repository for a quick inference implementation.
We have also adapted evaluation for VLMEvalKit.
Citation
If you find our paper, model, or data helpful, please give this repo a star and cite our article.
@article{luo2025ursa,
title={URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics},
author={Luo, Ruilin and Zheng, Zhuofan and Wang, Yifan and Yu, Yiyao and Ni, Xinzhe and Lin, Zicheng and Zeng, Jin and Yang, Yujiu},
journal={arXiv preprint arXiv:2501.04686},
year={2025}
}