base_model:
  - showlab/show-o-w-clip-vit
datasets:
  - brivangl/midjourney-v6-llava
language:
  - en
  - zh
license: apache-2.0
pipeline_tag: text-to-image
library_name: diffusers

Show-o-RecA

A self-supervised training framework that aligns understanding and generation with modest compute, yielding large zero-shot gains in generation and editing capability.

This repository hosts the model weights for Show-o-RecA. For installation, usage instructions, and further documentation, please visit Show-o's original GitHub repository.
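As a rough sketch of how the hosted weights could be fetched locally, the snippet below uses `snapshot_download` from the `huggingface_hub` package. The helper name `download_weights`, the default directory name, and the repository ID placeholder are illustrative assumptions rather than part of the project's documentation. Loading the weights and running inference should still follow the instructions in Show-o's GitHub repository.

```python
from pathlib import Path


def download_weights(repo_id: str, local_dir: str = "show-o-reca-weights") -> Path:
    """Download every file of a Hub model repository into local_dir.

    `repo_id` is the "<namespace>/<name>" identifier of the model repository.
    This helper and its default directory are hypothetical, for illustration only.
    """
    # Imported lazily so that defining the helper does not require the package.
    from huggingface_hub import snapshot_download

    downloaded = snapshot_download(repo_id=repo_id, local_dir=local_dir)
    return Path(downloaded)


# Example call (replace the placeholder with this repository's actual ID):
# weights_path = download_weights("<namespace>/Show-o-RecA")
```

The function only downloads files; it makes no assumption about how the checkpoint is structured or loaded.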

🧠 Method

arXiv Paper · Hugging Face Paper · GitHub · Hugging Face Collection · HF Demo · Project Page

📊 Benchmarks

| Model | GenEval ↑ | DPGBench ↑ | WISE ↑ |
|---|---|---|---|
| Show-o | 0.57 | 70.65 | 0.33 |
| Show-o-RecA | 0.62 | 75.70 | 0.34 |
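To put the benchmark numbers above in perspective, the short sketch below computes the absolute and relative improvement of Show-o-RecA over Show-o on each benchmark. The scores are copied from the table; the names `RESULTS` and `gains` are illustrative and not part of the released code.

```python
# Scores copied from the benchmark table: (Show-o, Show-o-RecA).
# Higher is better on all three benchmarks.
RESULTS = {
    "GenEval": (0.57, 0.62),
    "DPGBench": (70.65, 75.70),
    "WISE": (0.33, 0.34),
}


def gains(baseline: float, improved: float) -> tuple[float, float]:
    """Return (absolute gain, relative gain in percent) over the baseline."""
    absolute = improved - baseline
    relative_pct = 100.0 * absolute / baseline
    return absolute, relative_pct


for name, (show_o, show_o_reca) in RESULTS.items():
    abs_gain, rel_gain = gains(show_o, show_o_reca)
    print(f"{name}: +{abs_gain:.2f} absolute, +{rel_gain:.1f}% relative")
```

Note that GenEval and WISE are reported on a 0 to 1 scale while DPGBench is on a 0 to 100 scale, so the relative gain is the easier figure to compare across columns.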

License

Show-o-RecA is licensed under the Apache 2.0 license.

✍️ Citation

If you find our work inspiring or use our codebase in your research, please consider giving it a star ⭐ and citing it:

@misc{xie2025reconstructionalignmentimprovesunified,
  title={Reconstruction Alignment Improves Unified Multimodal Models},
  author={Ji Xie and Trevor Darrell and Luke Zettlemoyer and XuDong Wang},
  year={2025},
  eprint={2509.07295},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2509.07295},
}