---
base_model:
- Qwen/Qwen2.5-VL-72B-Instruct
language:
- en
license: apache-2.0
tags:
- transformers
- multimodal
pipeline_tag: visual-question-answering
---

# INFRL-Qwen2.5-VL-72B-Preview

## Model Overview

- **INFRL-Qwen2.5-VL-72B-Preview** improves on the visual reasoning of the [Qwen2.5-VL-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-72B-Instruct) model.
- As of March 25th, 2025, **INFRL-Qwen2.5-VL-72B-Preview** is the best-performing open-source VL model on several visual reasoning benchmarks ([MathVision](https://mathllm.github.io/mathvision/), [MathVista](https://mathvista.github.io/), [EMMA](https://emma-benchmark.github.io/#leaderboard), [MMMUPro](https://mmmu-benchmark.github.io/), [MathVerse](https://mathverse-cuhk.github.io/)).

| Models            | MathVision (test) | MathVista (testmini) | MathVerse (testmini) |
|-------------------|-------------------|----------------------|----------------------|
| GPT4o (R1-1V Rep) | 30.6              | 60                   | 41.2                 |
| Gemini-2.0-Flash  | 41.3              | 70.1                 | 50.6                 |
| Claude 3.5 Sonnet | 33.5              | 67.7                 | 47.8                 |
| QvQ-72B           | 35.9              | 71.4                 | 48.6                 |
| InternVL2.5-78B   | 34.9              | 72.3                 | 51.7                 |
| Qwen-VL-2.5-72B   | 38.1              | 74.8                 | 57.18                |
| INFRL-VL-Preview  | 41.9              | 77.8                 | 58.84                |

## Evaluation

We will release a code repository with vLLM support for VLM evaluation.

- 10x faster than `hf.generate()`, which makes evaluating larger benchmarks more efficient.
- Efficient answer extraction and matching, with performance aligned with Qwen2.5-VL. No costly LLM judge is needed.

Stay tuned!

## Contributors

### Supervisors

Wei Chu • Yuan Qi

### VL Team

Haozhe Wang • Zuming Huang

### RL Team

Haozhe Wang • Chao Qu • Long Li

## Thanks

Thanks to Jiaran Hao and Liuyihan Song for their support with the RL infrastructure.

## Citation

If you find our model useful, please consider citing:

```
@misc{INFRL_VL_Preview,
  author    = {Wang, Haozhe and Huang, Zuming and Qu, Chao and Chu, Wei and Qi, Yuan},
  title     = {INFRL-Qwen2.5-VL-72B-Preview},
  year      = 2025,
  url       = {https://huggingface.co/infly/INFRL-Qwen2.5-VL-72B-Preview},
  publisher = {Hugging Face}
}
```
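
## Illustrative Sketch: Rule-Based Answer Matching

The Evaluation section mentions answer extraction and matching without an LLM judge. The evaluation code has not been released, so the snippet below is only a minimal sketch of what such a rule-based step could look like, not the authors' implementation. The function names `extract_answer` and `is_match`, the `\boxed{...}` and `Answer:` conventions, and the numeric tolerance are all assumptions made for illustration.

```python
import re
from typing import Optional


def extract_answer(response: str) -> Optional[str]:
    """Pull a final answer out of a model response (hypothetical helper)."""
    # Assumption: prefer the last \boxed{...}; nested braces are not handled.
    boxed = re.findall(r"\\boxed\{([^{}]*)\}", response)
    if boxed:
        return boxed[-1].strip()
    # Assumption: otherwise accept the last "Answer: X" style line.
    tagged = re.findall(r"answer\s*:\s*([^\n]+)", response, flags=re.IGNORECASE)
    if tagged:
        return tagged[-1].strip().rstrip(".")
    # Fall back to the last number that appears in the text.
    numbers = re.findall(r"-?\d+(?:\.\d+)?", response)
    return numbers[-1] if numbers else None


def is_match(predicted: Optional[str], reference: str, tol: float = 1e-6) -> bool:
    """Compare numerically when both sides parse as floats, else as text."""
    if predicted is None:
        return False
    try:
        return abs(float(predicted) - float(reference)) <= tol
    except ValueError:
        # Non-numeric answers (e.g. multiple-choice letters): case-insensitive match.
        return predicted.strip().lower() == reference.strip().lower()


if __name__ == "__main__":
    response = "The shaded area is therefore \\boxed{12.50}."
    print(is_match(extract_answer(response), "12.5"))
```

A real benchmark harness would likely need more normalization (units, fractions, LaTeX expressions) than this sketch covers.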