yifanzhang114/MM-RLHF-Reward-7B-llava-ov-qwen

	---
	license: mit
	library_name: transformers
	pipeline_tag: image-text-to-text
	---

	<p align="center">
	<img src="https://cdn-uploads.huggingface.co/production/uploads/623d8ca4c29adf5ef6175615/bp_DZR79-mTj8Z6GJe9B0.png" width="80%" />
	</p>

	<font size=3><div align='center' >
	[[📖 arXiv Paper](https://arxiv.org/abs/2502.10391)]
	[[📊 MM-RLHF Data](https://huggingface.co/datasets/yifanzhang114/MM-RLHF)]
	[[📝 Homepage](https://mm-rlhf.github.io/)]
	[[🏆 Reward Model](https://huggingface.co/yifanzhang114/MM-RLHF-Reward-7B-llava-ov-qwen)]

	[[🔮 MM-RewardBench](https://huggingface.co/datasets/yifanzhang114/MM-RLHF-RewardBench)]
	[[🔮 MM-SafetyBench](https://github.com/yfzhang114/mmrlhf-eval)]
	[[📈 Evaluation Suite](https://github.com/yfzhang114/mmrlhf-eval)]
	</div></font>


	# The Next Step Forward in Multimodal LLM Alignment

	[2025/02/10] 🔥 We are proud to open-source MM-RLHF, a comprehensive project for aligning Multimodal Large Language Models (MLLMs) with human preferences. This release includes:

	- A high-quality MLLM alignment dataset.
	- A strong Critique-Based MLLM reward model and its training algorithm.
	- A novel alignment algorithm, MM-DPO.
	- Two new benchmarks.

	Our dataset and algorithms enable consistent performance improvements across 10 dimensions and 27 benchmarks.

	<p align="center">
	<img src="https://cdn-uploads.huggingface.co/production/uploads/623d8ca4c29adf5ef6175615/8nVZQd8bfB6NJIixCv6_X.png" width="80%" />
	</p>


	## Use

	### Intended use

	The model was trained on [MM-RLHF data](https://huggingface.co/datasets/yifanzhang114/MM-RLHF) and can process single images, multiple images, and videos.


	![image/png](https://cdn-uploads.huggingface.co/production/uploads/623d8ca4c29adf5ef6175615/2RQJMhntIwE15y9lEtBfP.png)

	Feel free to share your generations in the Community tab!

	### Generation

	We provide a simple generation process for using our model. For more details, refer to the [GitHub repository](https://github.com/yfzhang114/MM-RLHF).

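The model-loading and inference code is not reproduced in this card, so the GitHub repository remains the reference for it. As a rough illustration of how a reward model's output is often consumed downstream, the sketch below picks the best of several candidate responses by score. It assumes the reward model yields one scalar score per (prompt, response) pair, which this card does not specify. The names `pick_best_response` and `toy_score` are hypothetical, and `toy_score` is a placeholder word-overlap heuristic, not the real model.

```python
from typing import Callable, Sequence, Tuple


def pick_best_response(
    prompt: str,
    candidates: Sequence[str],
    score_fn: Callable[[str, str], float],
) -> Tuple[str, float]:
    """Return the highest-scoring candidate and its score.

    `score_fn` is assumed to map (prompt, response) to a scalar reward.
    Ties are resolved in favor of the earliest candidate.
    """
    if not candidates:
        raise ValueError("candidates must not be empty")
    scored = [(score_fn(prompt, c), c) for c in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


def toy_score(prompt: str, response: str) -> float:
    """Hypothetical stand-in scorer, NOT the MM-RLHF reward model.

    Counts response words that also appear in the prompt, only so the
    example runs without downloading any weights.
    """
    prompt_words = set(prompt.lower().split())
    return float(sum(1 for w in response.lower().split() if w in prompt_words))


if __name__ == "__main__":
    question = "what color is the cat"
    answers = ["the cat is black", "dogs bark"]
    best, score = pick_best_response(question, answers, toy_score)
    print(best, score)
```

In practice, `toy_score` would be replaced by a function that calls the actual reward model as described in the GitHub repository; the selection logic itself does not depend on how the score is produced.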
	## Citation

	If you find this work useful for your research and applications, please cite the related paper using this BibTeX:
	```bibtex

	```