Improve model card: Add license, pipeline tag, library name, and extended content

#1 · by nielsr (HF Staff) · opened
Files changed (1)
  1. README.md +51 -3
README.md CHANGED
@@ -1,9 +1,13 @@
  ---
- datasets:
- - OpenMMReasoner/OpenMMReasoner-RL-74K
  base_model:
  - Qwen/Qwen2.5-VL-7B-Instruct
+ datasets:
+ - OpenMMReasoner/OpenMMReasoner-RL-74K
+ license: apache-2.0
+ library_name: transformers
+ pipeline_tag: image-text-to-text
  ---
+
  # OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe

  <div align="center">
@@ -81,4 +85,48 @@ output_text = processor.batch_decode(
  generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
  )
  print(output_text)
- ```
+ ```
+
+ ## Evaluation Results
+
+ Our **OpenMMReasoner-7B (OMR-7B)** model demonstrates strong performance across a comprehensive suite of multimodal reasoning benchmarks. With only 874K SFT samples and 74K RL samples—significantly less data than many competing methods—our model achieves state-of-the-art or highly competitive results on 9 out of 14 benchmark tasks. Notably, OMR-7B achieves **79.5%** on MathVista testmini (best among all models), **63.8%** on MathVerse testmini (best), and **79.0%** on WeMath loose (best), demonstrating the effectiveness of our transparent two-stage training recipe. This performance validates our emphasis on data quality and rigorous training design over simply scaling dataset size.
+
+ | Model | SFT Data | RL Data | MathVista<br/>testmini | MathVision<br/>test | MathVision<br/>testmini | MathVerse<br/>testmini | DynaMath<br/>worst | WeMath<br/>loose | LogicVista<br/>test | MMMU<br/>val | MMMU-Pro<br/>standard | MMMU-Pro<br/>vision | CharXiv<br/>reas. | CharXiv<br/>desc. |
+ |-------|----------|---------|------------------------|---------------------|-------------------------|------------------------|--------------------|--------------------|---------------------|--------------|-----------------------|---------------------|-------------------|-------------------|
+ | VLAA-Thinker-Qwen2.5-7B | 126k | 25k | 68.0 | 26.4 | - | 48.2 | 22.4 | - | 48.5 | - | - | - | - | - |
+ | ThinkLite-7B-VL | - | 11k | 71.6 | 24.6 | - | 42.9 | 16.5 | - | 42.7 | - | - | - | - | - |
+ | VL-Rethinker-7B | - | 39k | 73.7 | 28.4 | - | 46.4 | 17.8 | - | 42.7 | - | 41.7 | - | - | - |
+ | M2-Reasoning | 6.2M | 102k | 75.0 | 42.1 | - | 40.4 | - | - | 50.6 | - | - | - | - | - |
+ | MMR1 | 1.6M | 15k | 72.0 | 31.8 | 29.0† | 55.4 | 27.9† | 68.0† | 48.9 | 52.4† | 41.1† | 37.1† | 43.5† | 71.1† |
+ | OpenVLThinker-7B | 3.3k | 9.6k | 65.3 | 23.0 | 26.9† | 38.1 | 16.8 | 61.9† | 44.5 | 55.1† | 39.7† | 38.4† | 41.0† | 69.2† |
+ | MM-Eureka-Qwen-7B | - | 15.6k | 72.6 | 28.1 | 32.1† | 45.4 | 23.0 | 59.8† | 46.3 | 54.4† | 40.1† | 37.1† | 42.4† | 74.1† |
+ | OVR-7B | 2M | 300k | 72.1 | **51.8** | 38.2† | 54.6 | 33.5 | 64.8 | **54.8** | 51.8† | **50.2** | 29.1† | 44.5 | 73.6 |
+ | **OMR-7B (ours)** | **874k** | **74k** | **79.5** | 43.6 | **38.8** | **63.8** | **34.9** | **79.0** | 50.0 | **57.8** | 44.1 | **40.6** | **46.1** | 73.5 |
+
+ **Note:** Bold numbers indicate the best performance, and † indicates results reproduced using the authors' checkpoints.
+
+ ## Citation
+
+ If you find OpenMMReasoner useful for your research and applications, please cite using this BibTeX:
+
+ ```bibtex
+ @misc{zhang2025openmmreasonerpushingfrontiersmultimodal,
+ title={OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe},
+ author={Kaichen Zhang and Keming Wu and Zuhao Yang and Kairui Hu and Bin Wang and Ziwei Liu and Xingxuan Li and Lidong Bing},
+ year={2025},
+ eprint={2511.16334},
+ archivePrefix={arXiv},
+ primaryClass={cs.AI},
+ url={https://arxiv.org/abs/2511.16334},
+ }
+ ```
+
+ ## Acknowledgements
+
+ We gratefully acknowledge the following open-source projects that made this work possible:
+
+ - [**lmms-eval**](https://github.com/EvolvingLMMs-Lab/lmms-eval) for providing the comprehensive evaluation framework for large multimodal models.
+ - [**lmms-engine**](https://github.com/EvolvingLMMs-Lab/lmms-engine) for the SFT training infrastructure and tools.
+ - [**verl**](https://github.com/volcengine/verl) for the reinforcement learning training framework.
+
+ We thank the developers and contributors of these projects for their excellent work and for making their code publicly available.
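
The new front matter declares `library_name: transformers` and `pipeline_tag: image-text-to-text`, while the diff hunk above shows only the tail of the README's usage snippet (`processor.batch_decode(...)`). Below is a minimal inference sketch of how that call chain typically fits together for a Qwen2.5-VL-based checkpoint such as OMR-7B, assuming the stock Qwen2.5-VL `transformers` API inherited from the base model; the repository id and image URL are placeholders, not values taken from this PR.

```python
# Sketch only: MODEL_ID and the image URL are placeholders, not taken from this PR.
import requests
from PIL import Image
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

MODEL_ID = "OpenMMReasoner/OMR-7B"  # placeholder repo id

model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    MODEL_ID, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(MODEL_ID)

# Any RGB image works; this URL is purely illustrative.
image = Image.open(requests.get("https://example.com/problem.png", stream=True).raw)

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Solve the problem in the image step by step."},
        ],
    }
]
# Build the chat-formatted prompt, then tensorize text + image together.
prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)

generated_ids = model.generate(**inputs, max_new_tokens=1024)
# Drop the prompt tokens so only the newly generated answer is decoded,
# which is what the batch_decode call visible in the diff operates on.
generated_ids_trimmed = [
    out[len(inp):] for inp, out in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False
)
print(output_text)
```

In recent `transformers` releases, the `image-text-to-text` tag also lets the checkpoint be loaded through `pipeline("image-text-to-text", model=...)` as a one-liner, but the explicit processor flow above mirrors the snippet that the README's code block ends with.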