IMAGHarmony: Controllable Image Editing with Consistent Object Quantity and Layout
Release
- [2025/5/30] 🔥 We released the technical report of IMAGHarmony.
- [2025/5/28] 🔥 We released the training and inference code of IMAGHarmony.
- [2025/5/17] We launched the project page of IMAGHarmony.
💡 Introduction
IMAGHarmony tackles the challenge of controllable image editing in multi-object scenes, where existing models struggle with aligning object quantity and spatial layout. To this end, IMAGHarmony introduces a structure-aware framework for quantity-and-layout consistent image editing (QL-Edit), enabling precise control over object count, category, and arrangement. We propose a harmony-aware attention (HA) mechanism to jointly model object structure and semantics, and a preference-guided noise selection (PNS) strategy to stabilize generation by selecting semantically aligned initial noise. Our method is trained and evaluated on HarmonyBench, a newly curated benchmark with diverse editing scenarios.
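The idea behind preference-guided noise selection can be illustrated with a minimal sketch: draw several candidate initial noises from different seeds, score each one, and keep the best. This is not the IMAGHarmony implementation. The names `make_noise`, `select_noise`, and `toy_score` are hypothetical, the noise is a plain list of floats rather than a diffusion latent, and the scoring function is a toy placeholder for whatever preference model judges semantic alignment.

```python
import random
from typing import Callable, List, Sequence, Tuple


def make_noise(seed: int, size: int = 8) -> List[float]:
    """Draw a reproducible Gaussian noise vector for a seed (stand-in for a latent)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


def select_noise(
    seeds: Sequence[int],
    score_fn: Callable[[List[float]], float],
) -> Tuple[int, List[float], float]:
    """Return (seed, noise, score) for the highest-scoring candidate seed."""
    if not seeds:
        raise ValueError("need at least one candidate seed")
    best = None
    for seed in seeds:
        noise = make_noise(seed)
        score = score_fn(noise)
        # Strict comparison keeps the earliest seed when scores tie.
        if best is None or score > best[2]:
            best = (seed, noise, score)
    return best


def toy_score(noise: List[float]) -> float:
    """Hypothetical preference score: favors noise whose mean is close to zero."""
    return -abs(sum(noise) / len(noise))


if __name__ == "__main__":
    seed, noise, score = select_noise(range(16), toy_score)
    print(f"selected seed {seed} with score {score:.4f}")
```

In a real pipeline, `score_fn` would presumably wrap a model that rates how well a generation from each seed matches the target quantity and layout, and the selected seed would then initialize the final sampling run.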
Download Models
Our models are available for download from Hugging Face. The other component models can be downloaded from their original repositories.
Acknowledgement
We would like to thank the contributors to the InstantStyle and IP-Adapter repositories for their open research and exploration.
The IMAGHarmony code is available for both academic and commercial use. Users are permitted to generate images using this tool, provided they comply with local laws and exercise responsible use. The developers disclaim all liability for any misuse or unlawful activity by users.
📨 Contact
If you have any questions, please feel free to contact us at [email protected] and [email protected].