---
license: openrail++
library_name: diffusers
tags:
- text-to-image
- stable-diffusion
---
# Conceptrol: Concept Control of Zero-shot Personalized Image Generation
## Model Card
This model implements Conceptrol, a training-free method that boosts zero-shot personalized image generation across Stable Diffusion, SDXL, and FLUX. It works without additional training, data, or models.
<p align="center">
<img src="demo/teaser.png">
</p>
[Conceptrol: Concept Control of Zero-shot Personalized Image Generation](https://huggingface.co/papers/2503.06568)
**Abstract:**
Personalized image generation with text-to-image diffusion models generates unseen images based on reference image content. Zero-shot adapter methods such as IP-Adapter and OminiControl are especially interesting because they do not require test-time fine-tuning. However, they struggle to balance preserving personalized content and adherence to the text prompt. We identify a critical design flaw resulting in this performance gap: current adapters inadequately integrate personalization images with the textual descriptions. The generated images, therefore, replicate the personalized content rather than adhere to the text prompt instructions. Yet the base text-to-image model has strong conceptual understanding capabilities that can be leveraged.
We propose Conceptrol, a simple yet effective framework that enhances zero-shot adapters without adding computational overhead. Conceptrol constrains the attention of visual specification with a textual concept mask that improves subject-driven generation capabilities. It achieves as much as 89% improvement on personalization benchmarks over the vanilla IP-Adapter and can even outperform fine-tuning approaches such as DreamBooth LoRA.
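
The masking idea in the abstract can be illustrated with a toy NumPy sketch. This is not the repository's implementation: the function names, the peak normalization, and the additive combination rule are all assumptions chosen for illustration. It only shows how attention on a textual concept token could gate where reference-image features are injected.

```python
from typing import Sequence

import numpy as np


def concept_mask(text_attn: np.ndarray, concept_token_ids: Sequence[int]) -> np.ndarray:
    """Build a soft spatial mask from text cross-attention (illustrative only).

    text_attn: array of shape (num_pixels, num_text_tokens) holding the
        attention each latent pixel pays to each text token.
    concept_token_ids: indices of the tokens naming the concept (e.g. "dog").
    Returns an array of shape (num_pixels, 1) whose maximum is ~1.
    """
    # Average the attention that each pixel assigns to the concept tokens.
    mask = text_attn[:, list(concept_token_ids)].mean(axis=1, keepdims=True)
    # Assumed normalization: divide by the peak so the strongest location keeps full weight.
    return mask / (mask.max() + 1e-8)


def masked_adapter_output(text_out, image_out, mask, scale=1.0):
    """Hypothetical combination rule: image features are added only where the mask is high."""
    return text_out + scale * mask * image_out


# Toy example: 4 latent pixels, 3 text tokens, concept token at index 1.
attn = np.array([[0.8, 0.1, 0.1],
                 [0.2, 0.6, 0.2],
                 [0.1, 0.3, 0.6],
                 [0.7, 0.2, 0.1]])
mask = concept_mask(attn, [1])
out = masked_adapter_output(np.zeros((4, 2)), np.ones((4, 2)), mask)
```

In this toy case the second pixel attends most strongly to the concept token, so it receives the largest share of the image-adapter features, while pixels that mostly attend to other tokens are damped.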
## Quick Start
#### 1. Environment Setup
``` bash
conda create -n conceptrol python=3.10
conda activate conceptrol
pip install -r requirements.txt
```
#### 2. Open `demo_sd.ipynb` / `demo_sdxl.ipynb`, or run `demo_flux.py`
## Local Setup using Gradio
#### 1. Start Gradio Interface
``` bash
pip install gradio
gradio gradio_src/app.py
```
#### 2. Use the GUI!
## Supported Models
| Model Name | Link |
|-----------------------|-------------------------------------------------------------|
| Stable Diffusion 1.5 | [stable-diffusion-v1-5/stable-diffusion-v1-5](https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5) |
| Realistic Vision V5.1 | [SG161222/Realistic_Vision_V5.1_noVAE](https://huggingface.co/SG161222/Realistic_Vision_V5.1_noVAE) |
| Stable Diffusion XL-1024 | [stabilityai/stable-diffusion-xl-base-1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) |
| Animagine XL v4.0 | [cagliostrolab/animagine-xl-4.0](https://huggingface.co/cagliostrolab/animagine-xl-4.0)|
| Realistic Vision XL V5.0 | [SG161222/RealVisXL_V5.0](https://huggingface.co/SG161222/RealVisXL_V5.0) |
| FLUX-schnell | [black-forest-labs/FLUX.1-schnell](https://huggingface.co/black-forest-labs/FLUX.1-schnell) |

| Adapter Name | Link |
|-----------------------|-------------------------------------------------------------|
| IP-Adapter | [h94/IP-Adapter](https://huggingface.co/h94/IP-Adapter/tree/main) |
| OminiControl | [Yuanshi/OminiControl](https://huggingface.co/Yuanshi/OminiControl) |
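
For scripted setups, the repository IDs in the tables above can be kept in a small lookup. The sketch below is an assumption about how one might organize them and is not part of the Conceptrol code: the short keys and the `resolve` and `fetch` helpers are hypothetical names. `fetch` relies on `snapshot_download` from `huggingface_hub`, which is imported lazily so the lookup works without the package installed.

```python
# Repo IDs copied from the tables above; the short keys are hypothetical.
BASE_MODELS = {
    "sd15": "stable-diffusion-v1-5/stable-diffusion-v1-5",
    "realistic-vision": "SG161222/Realistic_Vision_V5.1_noVAE",
    "sdxl": "stabilityai/stable-diffusion-xl-base-1.0",
    "animagine-xl": "cagliostrolab/animagine-xl-4.0",
    "realvis-xl": "SG161222/RealVisXL_V5.0",
    "flux-schnell": "black-forest-labs/FLUX.1-schnell",
}
ADAPTERS = {
    "ip-adapter": "h94/IP-Adapter",
    "ominicontrol": "Yuanshi/OminiControl",
}


def resolve(name: str) -> str:
    """Map a short key to its Hugging Face repo ID, raising KeyError if unknown."""
    registry = {**BASE_MODELS, **ADAPTERS}
    return registry[name]


def fetch(name: str, cache_dir=None) -> str:
    """Download a full repo snapshot and return its local path.

    Full snapshots can be large; this helper is only a convenience sketch.
    """
    from huggingface_hub import snapshot_download  # lazy import

    return snapshot_download(repo_id=resolve(name), cache_dir=cache_dir)
```

Calling `fetch("sdxl")` would download the SDXL base weights into the local Hugging Face cache; the demo notebooks may load models differently.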
## Source Code
https://github.com/QY-H00/Conceptrol
## Citation
``` bibtex
@article{he2025conceptrol,
title={Conceptrol: Concept Control of Zero-shot Personalized Image Generation},
author={Qiyuan He and Angela Yao},
journal={arXiv preprint arXiv:2503.06568},
year={2025}
}
```
## Acknowledgement
We thank the following repositories for their great work:
[diffusers](https://github.com/huggingface/diffusers),
[transformers](https://github.com/huggingface/transformers),
[IP-Adapter](https://github.com/tencent-ailab/IP-Adapter),
[OminiControl](https://github.com/Yuanshi9815/OminiControl).