URSA-8B-PS-GRPO

URSA-8B-PS-GRPO is trained with process-supervised GRPO (PS-GRPO), the method proposed in our paper.

Installation

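from huggingface_hub import snapshot_download

repo_id = "URSA-MATH/URSA-8B-PS-GRPO"
local_dir = "YOUR_LOCAL_PATH"  # replace with the directory to download into

# Download every file in the repository and return the local snapshot path.
snapshot_path = snapshot_download(
    repo_id=repo_id,
    local_dir=local_dir,
    revision="main",  # branch, tag, or commit hash
    cache_dir=None,  # None uses the default Hugging Face cache location
)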
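
After the download finishes, it can help to confirm what actually landed in the target directory. The following is a minimal sketch, not part of the URSA tooling: `summarize_snapshot` is a hypothetical helper that uses only the Python standard library to count the files under a directory and total their size. The count includes any metadata files the hub client may have written there.

```python
from pathlib import Path


def summarize_snapshot(snapshot_dir):
    """Return (file_count, total_bytes) for all regular files under snapshot_dir."""
    root = Path(snapshot_dir)
    # Walk the directory tree recursively and keep regular files only.
    files = [p for p in root.rglob("*") if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)
    return len(files), total_bytes


if __name__ == "__main__":
    import sys

    # Pass the directory used as local_dir in the download step (assumed here
    # to be given on the command line; defaults to the current directory).
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    count, size = summarize_snapshot(target)
    print(f"{count} files, {size / 1e9:.2f} GB")
```

Calling `summarize_snapshot(snapshot_path)` right after `snapshot_download` returns gives a quick sanity check that the weights were fetched completely before attempting inference.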

Inference

We have adapted vLLM to support URSA-8B. Please refer to the GitHub repository for a quick-start inference implementation.

Evaluation with VLMEvalKit is also supported.

Citation

If you find our paper, model, or data helpful, please give this repo a star 🌟 and cite our article ✏️.

@article{luo2025ursa,
  title={URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics},
  author={Luo, Ruilin and Zheng, Zhuofan and Wang, Yifan and Yu, Yiyao and Ni, Xinzhe and Lin, Zicheng and Zeng, Jin and Yang, Yujiu},
  journal={arXiv preprint arXiv:2501.04686},
  year={2025}
}

Safetensors model size: 8.04B params (tensor type: F32)