Update README.md
README.md
@@ -1,3 +1,39 @@
----
-license: mit
----
+---
+license: mit
+---
+
+<p align="center">
+<img src="https://cdn-uploads.huggingface.co/production/uploads/623d8ca4c29adf5ef6175615/bp_DZR79-mTj8Z6GJe9B0.png" width="80%" />
+</p>
+
+<font size=3><div align='center' >
+[[arXiv Paper](https://arxiv.org/abs/2406.08487)]
+[[MM-RLHF Data](https://huggingface.co/datasets/yifanzhang114/MM-RLHF)]
+[[Homepage](https://mm-rlhf.github.io/)]
+[[Reward Model](https://huggingface.co/yifanzhang114/MM-RLHF-Reward-7B-llava-ov-qwen)]
+[[MM-RewardBench](https://huggingface.co/datasets/yifanzhang114/MM-RLHF-RewardBench)]
+[[MM-SafetyBench](https://github.com/yfzhang114/mmrlhf-eval)]
+[[Evaluation Suite](https://github.com/yfzhang114/mmrlhf-eval)]
+</div></font>
+
+
+# The Next Step Forward in Multimodal LLM Alignment
+
+**[2025/02/10]** 🔥 We are proud to open-source **MM-RLHF**, a comprehensive project for aligning Multimodal Large Language Models (MLLMs) with human preferences. This release includes:
+
+- A **high-quality MLLM alignment dataset**.
+- A **strong Critique-Based MLLM reward model** and its training algorithm.
+- A **novel alignment algorithm MM-DPO**.
+- **Two new benchmarks**.
+
+Our dataset and algorithms enable consistent performance improvements across **10 dimensions** and **27 benchmarks** for open-source MLLMs.
+<p align="center">
+<img src="https://cdn-uploads.huggingface.co/production/uploads/623d8ca4c29adf5ef6175615/8nVZQd8bfB6NJIixCv6_X.png" width="80%" />
+</p>
+
+## Citation
+
+If you find it useful for your research and applications, please cite related papers/blogs using this BibTeX:
+```bibtex
+
+```