---
datasets:
- URSA-MATH/MMathCoT-1M
language:
- en
- zh
license: apache-2.0
library_name: transformers
pipeline_tag: image-text-to-text
---

# URSA-8B-PS-GRPO

URSA-8B-PS-GRPO is trained with process-supervised GRPO, the method proposed in our [paper](https://arxiv.org/pdf/2501.04686).

# Installation

Download the model weights with `huggingface_hub`:

```python
from huggingface_hub import snapshot_download

repo_id = "URSA-MATH/URSA-8B-PS-GRPO"
local_dir = YOUR_LOCAL_PATH  # replace with the directory to download into
snapshot_path = snapshot_download(
    repo_id=repo_id,
    local_dir=local_dir,
    revision="main",
    cache_dir=None,
)
```

# Inference

We have adapted vLLM for URSA-8B. Please refer to the [GitHub](https://github.com/URSA-MATH/URSA-MATH) repository for a quick inference implementation. We have also added evaluation support in [VLMEvalKit](https://github.com/open-compass/VLMEvalKit)!

# Citation

If you find our paper, model, or data helpful, please give this repo a star 🌟 and cite our article ✏️.

```
@article{luo2025ursa,
  title={URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics},
  author={Luo, Ruilin and Zheng, Zhuofan and Wang, Yifan and Yu, Yiyao and Ni, Xinzhe and Lin, Zicheng and Zeng, Jin and Yang, Yujiu},
  journal={arXiv preprint arXiv:2501.04686},
  year={2025}
}
```
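# Appendix: inference sketch

Since the card points to the GitHub repository for the actual inference code, the snippet below is only a minimal sketch of how a vLLM-based multimodal generation for URSA-8B might look. The prompt template in `build_prompt` (including the `<image>` and role tokens) is an illustrative assumption, not the model's confirmed chat format; consult the [GitHub](https://github.com/URSA-MATH/URSA-MATH) repository for the exact template and arguments.

```python
def build_prompt(question: str) -> str:
    """Wrap a math question in a simple multimodal chat template.
    NOTE: this template is an assumption for illustration only."""
    return f"<|user|>\n<image>\n{question}\n<|assistant|>\n"


def run_inference(image_path: str, question: str, model_path: str) -> str:
    """Run one image+text generation with vLLM (requires a GPU and the
    downloaded checkpoint at model_path)."""
    from PIL import Image
    from vllm import LLM, SamplingParams

    llm = LLM(model=model_path, trust_remote_code=True)
    params = SamplingParams(temperature=0.0, max_tokens=1024)
    outputs = llm.generate(
        {
            "prompt": build_prompt(question),
            "multi_modal_data": {"image": Image.open(image_path)},
        },
        sampling_params=params,
    )
    return outputs[0].outputs[0].text
```

Example usage (after downloading the weights as shown in the Installation section): `run_inference("geometry.png", "Find the area of the shaded region.", local_dir)`.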