|
--- |
|
license: mit |
|
library_name: transformers |
|
pipeline_tag: image-text-to-text |
|
--- |
|
|
|
<p align="center"> |
|
<img src="https://cdn-uploads.huggingface.co/production/uploads/623d8ca4c29adf5ef6175615/bp_DZR79-mTj8Z6GJe9B0.png" width="80%" /> |
|
</p> |
|
|
|
<font size=3><div align='center' > |
|
[[๐ arXiv Paper](https://arxiv.org/abs/2502.10391)] |
|
[[๐ MM-RLHF Data](https://huggingface.co/datasets/yifanzhang114/MM-RLHF)] |
|
[[๐ Homepage](https://mm-rlhf.github.io/)] |
|
[[๐ Reward Model](https://huggingface.co/yifanzhang114/MM-RLHF-Reward-7B-llava-ov-qwen)] |
|
|
|
[[๐ฎ MM-RewardBench](https://huggingface.co/datasets/yifanzhang114/MM-RLHF-RewardBench)] |
|
[[๐ฎ MM-SafetyBench](https://github.com/yfzhang114/mmrlhf-eval)] |
|
[[๐ Evaluation Suite](https://github.com/yfzhang114/mmrlhf-eval)] |
|
</div></font> |
|
|
|
|
|
# The Next Step Forward in Multimodal LLM Alignment |
|
|
|
**[2025/02/10]** ๐ฅ We are proud to open-source **MM-RLHF**, a comprehensive project for aligning Multimodal Large Language Models (MLLMs) with human preferences. This release includes: |
|
|
|
- A **high-quality MLLM alignment dataset**. |
|
- A **strong Critique-Based MLLM reward model** and its training algorithm. |
|
- A **novel alignment algorithm MM-DPO**. |
|
- **Two new benchmarks**. |
|
|
|
Our dataset and algorithms enable consistent performance improvements across **10 dimensions** and **27 benchmarks**.\n<p align="center"> |
|
<img src="https://cdn-uploads.huggingface.co/production/uploads/623d8ca4c29adf5ef6175615/8nVZQd8bfB6NJIixCv6_X.png" width="80%" /> |
|
</p> |
|
|
|
|
|
## Use |
|
|
|
### Intended use |
|
|
|
The model was trained on [MM-RLHF data](https://huggingface.co/datasets/yifanzhang114/MM-RLHF) and have the ability to interact with images, multi-image and videos. |
|
|
|
|
|
 |
|
|
|
**Feel free to share your generations in the Community tab!** |
|
|
|
### Generation |
|
|
|
We provide the simple generation process for using our model. For more details, you could refer to [Github](https://github.com/yfzhang114/MM-RLHF).\n |
|
## Citation |
|
|
|
If you find it useful for your research and applications, please cite related papers/blogs using this BibTeX: |
|
```bibtex |
|
|
|
``` |