Upload folder using huggingface_hub
- .gitattributes +1 -0
- README.md +85 -3
- assets/teaser.webp +3 -0
- assets/uso.webp +0 -0
- config.json +4 -0
- uso_flux_v1.0/dit_lora.safetensors +3 -0
- uso_flux_v1.0/projector.safetensors +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33 |
*.zip filter=lfs diff=lfs merge=lfs -text
|
34 |
*.zst filter=lfs diff=lfs merge=lfs -text
|
35 |
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
36 |
+
assets/teaser.webp filter=lfs diff=lfs merge=lfs -text
|
README.md
CHANGED
@@ -1,3 +1,85 @@
1 |
-
---
|
2 |
-
license: apache-2.0
|
3 |
-
|
1 |
+
---
|
2 |
+
license: apache-2.0
|
3 |
+
language:
|
4 |
+
- en
|
5 |
+
base_model:
|
6 |
+
- black-forest-labs/FLUX.1-dev
|
7 |
+
library_name: transformers
|
8 |
+
pipeline_tag: image-to-image
|
9 |
+
tags:
|
10 |
+
- image-generation
|
11 |
+
- subject-personalization
|
12 |
+
- style-transfer
|
13 |
+
- Diffusion-Transformer
|
14 |
+
---
|
15 |
+
|
16 |
+
<h3 align="center">
|
17 |
+
<img src="assets/uso.webp" alt="Logo" style="vertical-align: middle; width: 95px; height: auto;">
|
18 |
+
<br>
|
19 |
+
Unified Style and Subject-Driven Generation via Disentangled and Reward Learning
|
20 |
+
</h3>
|
21 |
+
|
22 |
+
<p align="center">
|
23 |
+
<a href="https://github.com/bytedance/USO"><img alt="Build" src="https://img.shields.io/github/stars/bytedance/USO"></a>
|
24 |
+
<a href="https://bytedance.github.io/USO/"><img alt="Build" src="https://img.shields.io/badge/Project%20Page-USO-blue"></a>
|
25 |
+
<a href="https://arxiv.org/abs/2508.18966"><img alt="Build" src="https://img.shields.io/badge/Tech%20Report-USO-b31b1b.svg"></a>
|
26 |
+
<a href="https://huggingface.co/bytedance-research/USO"><img src="https://img.shields.io/static/v1?label=%F0%9F%A4%97%20Hugging%20Face&message=Model&color=green"></a>
|
27 |
+
</p>
|
28 |
+
|
29 |
+

|
30 |
+
|
31 |
+
## 📖 Introduction
|
32 |
+
Existing literature typically treats style-driven and subject-driven generation as two disjoint tasks: the former prioritizes stylistic similarity, whereas the latter insists on subject consistency, resulting in an apparent antagonism. We argue that both objectives can be unified under a single framework because they ultimately concern the disentanglement and re-composition of “content” and “style”, a long-standing theme in style-driven research. To this end, we present USO, a Unified framework for Style-driven and subject-driven GeneratiOn. First, we construct a large-scale triplet dataset consisting of content images, style images, and their corresponding stylized content images. Second, we introduce a disentangled learning scheme that simultaneously aligns style features and disentangles content from style through two complementary objectives, style-alignment training and content–style disentanglement training. Third, we incorporate a style reward-learning paradigm to further enhance the model’s performance.
|
33 |
+
|
34 |
+
## ⚡️ Quick Start
|
35 |
+
|
36 |
+
### 🔧 Requirements and Installation
|
37 |
+
|
38 |
+
Clone our [GitHub repo](https://github.com/bytedance/USO)
|
39 |
+
|
40 |
+
|
41 |
+
Install the requirements
|
42 |
+
```bash
|
43 |
+
# create a virtual environment with Python >= 3.10 and <= 3.12, e.g.
|
44 |
+
# python -m venv uso_env
|
45 |
+
# source uso_env/bin/activate
|
46 |
+
# then install
|
47 |
+
pip install -r requirements.txt
|
48 |
+
```
|
49 |
+
|
50 |
+
Then download the checkpoints in one of three ways:
|
51 |
+
1. Run the inference scripts directly; the checkpoints are downloaded automatically by the `hf_hub_download` function in the code to your `$HF_HOME` (default: `~/.cache/huggingface`).
|
52 |
+
2. Use `huggingface-cli download <repo name>` to download `black-forest-labs/FLUX.1-dev`, `xlabs-ai/xflux_text_encoders`, `openai/clip-vit-large-patch14`, and `TODO UNO hf model`, then run the inference scripts.
|
53 |
+
3. Use `huggingface-cli download <repo name> --local-dir <LOCAL_DIR>` to download all the checkpoints mentioned in 2. to the directories you want. Then set the environment variable `TODO`. Finally, run the inference scripts.
|
54 |
+
|
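The download options above can be sketched in Python. This is a minimal sketch, not the repository's own loading code: the repo id `bytedance-research/USO` is taken from the Hugging Face badge link, the two file names come from this commit, and the helper name `download_uso_checkpoints` is hypothetical. It assumes `huggingface_hub` is installed and that the files are publicly downloadable.

```python
from pathlib import Path
from typing import Optional

# Assumption: repo id from the Hugging Face badge link; file names from this commit.
USO_REPO_ID = "bytedance-research/USO"
USO_FILES = (
    "uso_flux_v1.0/dit_lora.safetensors",
    "uso_flux_v1.0/projector.safetensors",
)


def download_uso_checkpoints(local_dir: Optional[str] = None) -> list[Path]:
    """Fetch the USO weights (hypothetical helper, not part of the repo).

    With local_dir=None the files go to the Hugging Face cache under $HF_HOME;
    otherwise they are placed under local_dir, keeping the subfolder layout.
    """
    # Imported lazily so this module can be read without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return [
        Path(hf_hub_download(repo_id=USO_REPO_ID, filename=name, local_dir=local_dir))
        for name in USO_FILES
    ]
```

Calling `download_uso_checkpoints()` roughly mirrors option 1 (cache download), and `download_uso_checkpoints("./weights")` mirrors option 3. How the inference scripts locate a custom directory is left as `TODO` in the README, so that part is not shown here.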
55 |
+
### 🌟 Gradio Demo
|
56 |
+
|
57 |
+
```bash
|
58 |
+
python app.py
|
59 |
+
```
|
60 |
+
|
61 |
+
## 📄 Disclaimer
|
62 |
+
<p>
|
63 |
+
We open-source this project for academic research. The vast majority of images
|
64 |
+
used in this project are either generated or from open-source datasets. If you have any concerns,
|
65 |
+
please contact us, and we will promptly remove any inappropriate content.
|
66 |
+
Our project is released under the Apache 2.0 License. If you apply it to other base models,
|
67 |
+
please ensure that you comply with the original licensing terms.
|
68 |
+
<br><br>This research aims to advance the field of generative AI. Users are free to
|
69 |
+
create images using this tool, provided they comply with local laws and exercise
|
70 |
+
responsible usage. The developers are not liable for any misuse of the tool by users.</p>
|
71 |
+
|
72 |
+
## Citation
|
73 |
+
We would also appreciate it if you could give a star ⭐ to our [GitHub repository](https://github.com/bytedance/USO). Thanks a lot!
|
74 |
+
|
75 |
+
If you find this project useful for your research, please consider citing our paper:
|
76 |
+
```bibtex
|
77 |
+
@article{wu2025uso,
|
78 |
+
title={USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning},
|
79 |
+
author={Shaojin Wu and Mengqi Huang and Yufeng Cheng and Wenxu Wu and Jiahe Tian and Yiming Luo and Fei Ding and Qian He},
|
80 |
+
year={2025},
|
81 |
+
eprint={2508.18966},
|
82 |
+
archivePrefix={arXiv},
|
83 |
+
primaryClass={cs.CV},
|
84 |
+
}
|
85 |
+
```
|
assets/teaser.webp
ADDED
assets/uso.webp
ADDED
config.json
ADDED
@@ -0,0 +1,4 @@
1 |
+
{
|
2 |
+
"_diffusers_version": "0.30.1",
|
3 |
+
"_uso_flux_version": "1.0"
|
4 |
+
}
|
uso_flux_v1.0/dit_lora.safetensors
ADDED
@@ -0,0 +1,3 @@
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:a03fa8430997f1c371c2471b133bdc03433a50564e0a29c096217077b0309e41
|
3 |
+
size 478187816
|
uso_flux_v1.0/projector.safetensors
ADDED
@@ -0,0 +1,3 @@
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:9a0dfcd6644e3acaf6995625562ab0af1f9cf048bf739c7e5822ee106fb44311
|
3 |
+
size 21548200
|
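The two `.safetensors` entries above are Git LFS pointer files, not the weights themselves: each records the spec version, a SHA-256 object id, and the byte size of the real file. The sketch below shows one way to parse such a pointer and check a downloaded file against it. The names `LfsPointer`, `parse_lfs_pointer`, and `file_matches_pointer` are illustrative, and the check assumes the oid uses `sha256`, as it does in this commit.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class LfsPointer:
    version: str
    hash_algo: str
    digest: str
    size: int


def parse_lfs_pointer(text: str) -> LfsPointer:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    hash_algo, digest = fields["oid"].split(":", 1)
    return LfsPointer(fields["version"], hash_algo, digest, int(fields["size"]))


def file_matches_pointer(path: str, pointer: LfsPointer) -> bool:
    """Return True if the file at path has the size and SHA-256 in the pointer."""
    sha = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large checkpoints do not need to fit in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            sha.update(chunk)
            size += len(chunk)
    return size == pointer.size and sha.hexdigest() == pointer.digest


# Pointer contents for uso_flux_v1.0/projector.safetensors, copied from the diff.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:9a0dfcd6644e3acaf6995625562ab0af1f9cf048bf739c7e5822ee106fb44311
size 21548200
"""

pointer = parse_lfs_pointer(POINTER)
print(pointer.hash_algo, pointer.size)  # sha256 21548200
```

After downloading, `file_matches_pointer("<path to projector.safetensors>", pointer)` would confirm that the local file is the object this commit points to.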