---
datasets:
- URSA-MATH/MMathCoT-1M
language:
- en
- zh
license: apache-2.0
library_name: transformers
pipeline_tag: image-text-to-text
---

# URSA-8B-PS-GRPO

URSA-8B-PS-GRPO is trained with process-supervised GRPO, the method proposed in our [paper](https://arxiv.org/pdf/2501.04686).

# Installation

Download the model weights with `huggingface_hub`:

```python
from huggingface_hub import snapshot_download

repo_id = "URSA-MATH/URSA-8B-PS-GRPO"
local_dir = YOUR_LOCAL_PATH  # replace with the directory to download into
snapshot_path = snapshot_download(
    repo_id=repo_id,
    local_dir=local_dir,
    revision="main",
    cache_dir=None,
)
```

# Inference

We have adapted vLLM for URSA-8B. Please refer to the [GitHub](https://github.com/URSA-MATH/URSA-MATH) repository for a quick inference implementation. We have also added evaluation support in [VLMEvalKit](https://github.com/open-compass/VLMEvalKit)!

# Citation

If you find our paper, model, or data helpful, please give this repo a star 🌟 and cite our article ✏️.

```
@article{luo2025ursa,
  title={URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics},
  author={Luo, Ruilin and Zheng, Zhuofan and Wang, Yifan and Yu, Yiyao and Ni, Xinzhe and Lin, Zicheng and Zeng, Jin and Yang, Yujiu},
  journal={arXiv preprint arXiv:2501.04686},
  year={2025}
}
```
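# Appendix: inference sketch

Since the card points to the GitHub repository for the actual inference code, the snippet below is only a minimal sketch of how a vLLM-based multimodal generation for URSA-8B might look. The prompt template in `build_prompt` (including the `<image>` and role tokens) is an illustrative assumption, not the model's confirmed chat format; consult the [GitHub](https://github.com/URSA-MATH/URSA-MATH) repository for the exact template and arguments.

```python
def build_prompt(question: str) -> str:
    """Wrap a math question in a simple multimodal chat template.
    NOTE: this template is an assumption for illustration only."""
    return f"<|user|>\n<image>\n{question}\n<|assistant|>\n"


def run_inference(image_path: str, question: str, model_path: str) -> str:
    """Run one image+text generation with vLLM (requires a GPU and the
    downloaded checkpoint at model_path)."""
    from PIL import Image
    from vllm import LLM, SamplingParams

    llm = LLM(model=model_path, trust_remote_code=True)
    params = SamplingParams(temperature=0.0, max_tokens=1024)
    outputs = llm.generate(
        {
            "prompt": build_prompt(question),
            "multi_modal_data": {"image": Image.open(image_path)},
        },
        sampling_params=params,
    )
    return outputs[0].outputs[0].text
```

Example usage (after downloading the weights as shown in the Installation section): `run_inference("geometry.png", "Find the area of the shaded region.", local_dir)`.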