Qwen-Image Image Structure Control Model

Model Introduction

This is an image structure control model trained on top of Qwen-Image. It uses a ControlNet architecture and controls the structure of generated images from Canny edge detection maps. It was trained with the DiffSynth-Studio framework on the BLIP3o dataset.

Effect Demonstration

[Image table: Structure Map | Generated Image 1 | Generated Image 2]

Inference Code

Install DiffSynth-Studio from source:

git clone https://github.com/modelscope/DiffSynth-Studio.git
cd DiffSynth-Studio
pip install -e .

Then run inference with the following Python script:

from diffsynth.pipelines.qwen_image import QwenImagePipeline, ModelConfig, ControlNetInput
from PIL import Image
import torch
from modelscope import dataset_snapshot_download


# Load the Qwen-Image base components together with the Canny ControlNet weights.
pipe = QwenImagePipeline.from_pretrained(
    torch_dtype=torch.bfloat16,
    device="cuda",
    model_configs=[
        ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="transformer/diffusion_pytorch_model*.safetensors"),
        ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="text_encoder/model*.safetensors"),
        ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="vae/diffusion_pytorch_model.safetensors"),
        ModelConfig(model_id="DiffSynth-Studio/Qwen-Image-Blockwise-ControlNet-Canny", origin_file_pattern="model.safetensors"),
    ],
    tokenizer_config=ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="tokenizer/"),
)

# Download the example Canny edge map from ModelScope.
dataset_snapshot_download(
    dataset_id="DiffSynth-Studio/example_image_dataset",
    local_dir="./data/example_image_dataset",
    allow_file_pattern="canny/image_1.jpg"
)
# Load the edge map and resize it to 1328x1328.
controlnet_image = Image.open("data/example_image_dataset/canny/image_1.jpg").resize((1328, 1328))

# Generate an image whose structure follows the edge map, then save it.
prompt = "A puppy with shiny, smooth fur and lively eyes, with a spring garden full of cherry blossoms as the background, beautiful and warm."
image = pipe(
    prompt, seed=0,
    blockwise_controlnet_inputs=[ControlNetInput(image=controlnet_image)]
)
image.save("image.jpg")
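
The script above resizes the edge map to a fixed 1328x1328 square, which would distort a non-square edge map. The following is a minimal sketch of picking a size that keeps the aspect ratio at roughly the same pixel area. The helper names `fit_size` and `resize_control_image` are hypothetical, and snapping to multiples of 16 is an assumption about the model's latent grid rather than a documented requirement.

```python
import math

from PIL import Image


def fit_size(width, height, target_area=1328 * 1328, multiple=16):
    """Return (w, h) with the input aspect ratio, an area close to
    `target_area`, and both sides snapped to a multiple of `multiple`.

    `multiple=16` is an assumed constraint, not a documented one.
    """
    scale = math.sqrt(target_area / (width * height))
    new_w = max(multiple, round(width * scale / multiple) * multiple)
    new_h = max(multiple, round(height * scale / multiple) * multiple)
    return new_w, new_h


def resize_control_image(image):
    """Resize a PIL edge map to the size chosen by fit_size."""
    return image.resize(fit_size(*image.size))


# Example usage (the file path is hypothetical):
# controlnet_image = resize_control_image(Image.open("my_canny.jpg"))
```

If the pipeline call accepts `height` and `width` arguments (an assumption to verify against the DiffSynth-Studio source), they would presumably need to match the resized edge map.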

license: apache-2.0

Model size: 1.13B params (Safetensors, tensor type BF16)

Model tree for SahilCarterr/Qwen-Image-Blockwise-ControlNet-Canny: base model Qwen/Qwen-Image; this model is an adapter.