---
license: mit
---
[arXiv Paper] [MM-RLHF Data] [Homepage] [Reward Model] [MM-RewardBench] [MM-SafetyBench] [Evaluation Suite]
The Next Step Forward in Multimodal LLM Alignment
[2025/02/10] We are proud to open-source MM-RLHF, a comprehensive project for aligning Multimodal Large Language Models (MLLMs) with human preferences. This release includes:
- A high-quality MLLM alignment dataset.
- A strong Critique-Based MLLM reward model and its training algorithm.
- A novel alignment algorithm, MM-DPO.
- Two new benchmarks: MM-RewardBench and MM-SafetyBench.
Our dataset and algorithms consistently improve the performance of open-source MLLMs across 10 evaluation dimensions and 27 benchmarks.
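For a quick orientation, the sketch below shows one way to pull the alignment data with the Hugging Face `datasets` library. The repository id, split name, and field access are placeholders rather than the official quick-start; consult the MM-RLHF Data link above for the released location and schema.

```python
# Minimal sketch (not the official quick-start): load the MM-RLHF
# preference data with the Hugging Face `datasets` library.
from datasets import load_dataset

# Placeholder repository id -- replace with the actual id from the
# "MM-RLHF Data" link above.
dataset = load_dataset("your-org/MM-RLHF")

# Inspect the available splits and one example; the exact field names
# depend on the released schema.
print(dataset)
print(dataset["train"][0])
```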
Citation
If you find this project useful for your research or applications, please cite the related papers/blogs using this BibTeX: