base_model:
- showlab/show-o-w-clip-vit
datasets:
- brivangl/midjourney-v6-llava
language:
- en
- zh
license: apache-2.0
pipeline_tag: text-to-image
library_name: diffusers
Show-o-RecA
A self-supervised training framework that aligns understanding and generation with modest compute, yielding substantial zero-shot gains in generation and editing capability.
This repository hosts the model weights for Show-o-RecA. For installation, usage instructions, and further documentation, please visit Show-o's original GitHub repository.
Method
Benchmarks
Model | GenEval ↑ | DPGBench ↑ | WISE ↑ |
---|---|---|---|
Show-o | 0.57 | 70.65 | 0.33 |
Show-o-RecA | 0.62 | 75.70 | 0.34 |
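To make the size of the improvement easier to read, the short sketch below computes the absolute and relative gain of Show-o-RecA over Show-o on each benchmark. The scores are copied from the table above; the names `baseline`, `reca`, and `gains` are illustrative only and are not part of the Show-o codebase.

```python
# Scores copied from the benchmark table (higher is better on all three).
baseline = {"GenEval": 0.57, "DPGBench": 70.65, "WISE": 0.33}  # Show-o
reca = {"GenEval": 0.62, "DPGBench": 75.70, "WISE": 0.34}      # Show-o-RecA


def gains(before, after):
    """Return {benchmark: (absolute gain, relative gain in percent)}."""
    return {
        name: (
            after[name] - before[name],
            100.0 * (after[name] - before[name]) / before[name],
        )
        for name in before
    }


# Print one summary line per benchmark.
for name, (abs_gain, rel_gain) in gains(baseline, reca).items():
    print(f"{name}: +{abs_gain:.2f} absolute, +{rel_gain:.1f}% relative")
```

Under these numbers the relative gains come out to roughly 8.8% on GenEval, 7.1% on DPGBench, and 3.0% on WISE. Note that the benchmarks use different scales, so the absolute gains are not directly comparable across columns.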
License
Show-o-RecA is licensed under the Apache 2.0 license.
Citation
If you find our work inspiring or use our codebase in your research, please consider giving the repository a star and citing our paper:
@misc{xie2025reconstructionalignmentimprovesunified,
  title={Reconstruction Alignment Improves Unified Multimodal Models},
  author={Ji Xie and Trevor Darrell and Luke Zettlemoyer and XuDong Wang},
  year={2025},
  eprint={2509.07295},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2509.07295},
}