CodeGoat24 committed on
Commit
ad279bc
·
verified ·
1 Parent(s): d552a79

Update README.md

Files changed (1)
  1. README.md +7 -2
README.md CHANGED
@@ -11,7 +11,7 @@ base_model:
  This model is trained on LLaVA-OneVision based on DPO preference data constructed by our [UnifiedReward-7B](https://huggingface.co/CodeGoat24/UnifiedReward-7b) for enhanced image understanding ability.
 
  For further details, please refer to the following resources:
- - 📰 Paper:
+ - 📰 Paper: https://arxiv.org/pdf/2503.05236
  - 🪐 Project Page: https://codegoat24.github.io/UnifiedReward/
  - 🤗 Model Collections: https://huggingface.co/collections/CodeGoat24/unifiedreward-models-67c3008148c3a380d15ac63a
  - 🤗 Dataset Collections: https://huggingface.co/collections/CodeGoat24/unifiedreward-training-data-67c300d4fd5eff00fa7f1ede
@@ -76,5 +76,10 @@ print(text_outputs)
  ## Citation
 
  ```
-
+ @article{UnifiedReward,
+ title={Unified Reward Model for Multimodal Understanding and Generation},
+ author={Wang, Yibin and Zang, Yuhang and Li, Hao and Jin, Cheng and Wang, Jiaqi},
+ journal={arXiv preprint arXiv:2503.05236},
+ year={2025}
+ }
  ```