---
base_model:
- showlab/show-o-w-clip-vit
datasets:
- brivangl/midjourney-v6-llava
language:
- en
- zh
license: apache-2.0
pipeline_tag: text-to-image
library_name: diffusers
---

# Show-o-RecA

> A self-supervised training framework that aligns understanding and generation with modest compute, yielding large **zero-shot** gains in generation and editing capability.

This repository hosts the model weights for **Show-o-RecA**. For installation, usage instructions, and further documentation, please visit Show-o's original [GitHub repository](https://github.com/showlab/Show-o).

## 🧠 Method

- [Paper (PDF)](https://arxiv.org/pdf/2509.07295)
- [arXiv](https://arxiv.org/abs/2509.07295)
- [Hugging Face paper page](https://huggingface.co/papers/2509.07295)
- [Code (GitHub)](https://github.com/HorizonWind2004/reconstruction-alignment)
- [Model collection (Hugging Face)](https://huggingface.co/collections/sanaka87/realign-68ad2176380355a3dcedc068)
- [Demo Space (Hugging Face)](https://huggingface.co/spaces/sanaka87/BAGEL-ReAlign)
- [Project page](https://reconstruction-alignment.github.io/)

## 📊 Benchmarks

| Model | GenEval ↑ | DPGBench ↑ | WISE ↑ |
| ------------ | --------- | --------- | --------- |
| **Show-o** | 0.57 | 70.65 | 0.33 |
| **Show-o-RecA** | **0.62** | **75.70** | **0.34** |

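
As a reading aid for the benchmark table, the short sketch below computes the absolute and relative gain of Show-o-RecA over Show-o on each benchmark. The scores are copied from the table; the names `scores`, `gains`, and `summary` are illustrative and not part of any Show-o or RecA codebase.

```python
# Scores copied from the benchmark table: (Show-o, Show-o-RecA).
# Higher is better on all three benchmarks.
scores = {
    "GenEval": (0.57, 0.62),
    "DPGBench": (70.65, 75.70),
    "WISE": (0.33, 0.34),
}


def gains(baseline: float, tuned: float) -> tuple[float, float]:
    """Return (absolute gain, relative gain in percent) of tuned over baseline."""
    absolute = tuned - baseline
    relative_pct = 100.0 * absolute / baseline
    return absolute, relative_pct


# Illustrative summary keyed by benchmark name.
summary = {name: gains(base, reca) for name, (base, reca) in scores.items()}

for name, (absolute, relative_pct) in summary.items():
    print(f"{name:<9} +{absolute:.2f} absolute, +{relative_pct:.1f}% relative")
```

Note that the benchmarks use different scales (GenEval and WISE are reported between 0 and 1, DPGBench on a 0 to 100 scale), so the relative figures are the easier ones to compare across columns.
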
## License

Show-o-RecA is licensed under the Apache 2.0 license.

## ✍️ Citation

If you find our work inspiring or use our codebase in your research, please consider giving us a star ⭐ and a citation.

@misc{xie2025reconstructionalignmentimprovesunified,
      title={Reconstruction Alignment Improves Unified Multimodal Models},
      author={Ji Xie and Trevor Darrell and Luke Zettlemoyer and XuDong Wang},
      year={2025},
      eprint={2509.07295},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2509.07295},
}