RiverZ committed (verified) · Commit ee5c12b · 1 parent: 15d1ff0

Update README.md

Files changed (1): README.md (+74 -5)

README.md CHANGED

---
license: other
license_name: no-commercial
license_link: https://github.com/River-Zhang/ICEdit/blob/main/LICENSE
datasets:
- osunlp/MagicBrush
- TIGER-Lab/OmniEdit-Filtered-1.2M
language:
- en
base_model:
- black-forest-labs/FLUX.1-Fill-dev
pipeline_tag: image-to-image
library_name: diffusers
tags:
- art
---
<div align="center">

<h1>In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer</h1>

<div>
<a href="https://river-zhang.github.io/zechuanzhang//" target="_blank">Zechuan Zhang</a>&emsp;
<a href="https://horizonwind2004.github.io/" target="_blank">Ji Xie</a>&emsp;
<a href="https://yulu.net.cn/" target="_blank">Yu Lu</a>&emsp;
<a href="https://z-x-yang.github.io/" target="_blank">Zongxin Yang</a>&emsp;
<a href="https://scholar.google.com/citations?user=RMSuNFwAAAAJ&hl=zh-CN&oi=ao" target="_blank">Yi Yang✉</a>&emsp;
</div>
<div>
ReLER, CCAI, Zhejiang University; Harvard University
</div>
<div>
<sup>✉</sup>Corresponding Author
</div>
<div>
<a href="https://arxiv.org/abs/2504.20690" target="_blank">Arxiv</a>&emsp;
<a href="https://github.com/River-Zhang/ICEdit?tab=readme-ov-file" target="_blank">Github</a>&emsp;
<a href="https://huggingface.co/spaces/RiverZ/ICEdit" target="_blank">Huggingface Demo 🤗</a>&emsp;
<a href="https://river-zhang.github.io/ICEdit-gh-pages/" target="_blank">Project Page</a>
</div>

<div style="width: 80%; margin:auto;">
<img style="width:100%; display: block; margin: auto;" src="docs/images/teaser.png">
<p style="text-align: left;">We present In-Context Edit, a novel approach that achieves state-of-the-art instruction-based editing <b>using just 0.5% of the training data and 1% of the parameters required by prior SOTA methods</b>. The first row illustrates a series of multi-turn edits, executed with high precision, while the second and third rows highlight diverse, visually impressive single-turn editing results from our method.</p>
</div>

:open_book: For more visual results, check out our <a href="https://river-zhang.github.io/ICEdit-gh-pages/" target="_blank">project page</a>.

This repository will contain the official implementation of _ICEdit_.
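
Since the card declares `library_name: diffusers` with `black-forest-labs/FLUX.1-Fill-dev` as the base model, inference can be sketched as below. This is a minimal sketch, not the official script: the LoRA repo id, the diptych prompt template, and the guidance value are assumptions modeled on the GitHub repository, so please refer there for the supported pipeline.

```python
# Minimal inference sketch (assumptions noted inline; the official script
# lives in the GitHub repo). ICEdit edits by inpainting the right half of a
# side-by-side diptych with FLUX.1-Fill-dev plus this repo's LoRA.
import torch
from diffusers import FluxFillPipeline
from PIL import Image

pipe = FluxFillPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Fill-dev", torch_dtype=torch.bfloat16
)
pipe.load_lora_weights("RiverZ/normal-lora")  # assumption: this repo's LoRA id
pipe.to("cuda")

instruction = "make the hat red"  # your edit instruction
src = Image.open("input.png").convert("RGB").resize((512, 768))
w, h = src.size

# Build the diptych: source on the left, a masked copy on the right.
diptych = Image.new("RGB", (w * 2, h))
diptych.paste(src, (0, 0))
diptych.paste(src, (w, 0))
mask = Image.new("L", (w * 2, h), 0)
mask.paste(Image.new("L", (w, h), 255), (w, 0))  # inpaint the right half

# Assumed prompt template, modeled on the project's in-context formulation.
prompt = (
    "A diptych with two side-by-side images of the same scene. On the right, "
    f"the scene is exactly the same as on the left but {instruction}."
)

result = pipe(
    prompt=prompt,
    image=diptych,
    mask_image=mask,
    height=h,
    width=w * 2,
    guidance_scale=50,  # Fill models use high guidance; tune as needed
    num_inference_steps=28,
    generator=torch.Generator("cuda").manual_seed(42),
).images[0]

result.crop((w, 0, w * 2, h)).save("edited.png")  # keep the edited right half
```

If an edit fails, change the `manual_seed` value first; as the tips below note, many failures are seed-dependent.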

<div align="left">

# ⚠️ Tips

### If you encounter a failure case like the ones below, please **try again with a different seed** (a seed-retry sketch follows the list)!

- Our base model, FLUX, does not inherently support a wide range of styles, so a large portion of our dataset involves style transfer. As a result, the model **may sometimes inexplicably change your artistic style**.

- Our training dataset is **mostly targeted at realistic images**. For non-realistic images, such as **anime** or **blurry pictures**, the editing success rate **drops, which can also hurt the final image quality**.

- While the success rates for adding objects, modifying color attributes, applying style transfer, and changing backgrounds are high, the success rate for object removal is comparatively lower due to the low quality of the OmniEdit removal dataset.
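
Because failures are often seed-dependent, a hypothetical retry helper (reusing the `pipe`, `prompt`, `diptych`, and `mask` names from the sketch above) can batch several seeds so the best candidate can be picked by eye:

```python
import torch

def edit_with_seeds(pipe, prompt, diptych, mask, seeds=(0, 1, 2, 3)):
    """Hypothetical helper: run the same edit with several seeds and
    return (seed, image) pairs so the best result can be chosen manually."""
    w, h = diptych.size
    results = []
    for seed in seeds:
        image = pipe(
            prompt=prompt,
            image=diptych,
            mask_image=mask,
            height=h,
            width=w,
            generator=torch.Generator("cuda").manual_seed(seed),
        ).images[0]
        results.append((seed, image))
    return results
```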

The current model is the one used for the experiments in the paper, trained on only 4 A800 GPUs (total `batch_size` = 2 × 2 × 4 = 16). In the future, we will enhance the dataset, scale up training, and finally release a more powerful model.

# To Do List

- [x] Inference Code
- [ ] Inference-time Scaling with VLM
- [x] Pretrained Weights
- [ ] More Inference Demos
  - [x] Gradio demo
  - [ ] Comfy UI demo
- [ ] Training Code