Mer-o committed on
Commit 4940090 · 2 Parent(s): f7ac47c 536503c

merging to main

.gitattributes CHANGED
@@ -1,35 +1,2 @@
- *.7z filter=lfs diff=lfs merge=lfs -text
- *.arrow filter=lfs diff=lfs merge=lfs -text
- *.bin filter=lfs diff=lfs merge=lfs -text
- *.bz2 filter=lfs diff=lfs merge=lfs -text
- *.ckpt filter=lfs diff=lfs merge=lfs -text
- *.ftz filter=lfs diff=lfs merge=lfs -text
- *.gz filter=lfs diff=lfs merge=lfs -text
- *.h5 filter=lfs diff=lfs merge=lfs -text
- *.joblib filter=lfs diff=lfs merge=lfs -text
- *.lfs.* filter=lfs diff=lfs merge=lfs -text
- *.mlmodel filter=lfs diff=lfs merge=lfs -text
- *.model filter=lfs diff=lfs merge=lfs -text
- *.msgpack filter=lfs diff=lfs merge=lfs -text
- *.npy filter=lfs diff=lfs merge=lfs -text
- *.npz filter=lfs diff=lfs merge=lfs -text
- *.onnx filter=lfs diff=lfs merge=lfs -text
- *.ot filter=lfs diff=lfs merge=lfs -text
- *.parquet filter=lfs diff=lfs merge=lfs -text
- *.pb filter=lfs diff=lfs merge=lfs -text
- *.pickle filter=lfs diff=lfs merge=lfs -text
- *.pkl filter=lfs diff=lfs merge=lfs -text
- *.pt filter=lfs diff=lfs merge=lfs -text
- *.pth filter=lfs diff=lfs merge=lfs -text
- *.rar filter=lfs diff=lfs merge=lfs -text
- *.safetensors filter=lfs diff=lfs merge=lfs -text
- saved_model/**/* filter=lfs diff=lfs merge=lfs -text
- *.tar.* filter=lfs diff=lfs merge=lfs -text
- *.tar filter=lfs diff=lfs merge=lfs -text
- *.tflite filter=lfs diff=lfs merge=lfs -text
- *.tgz filter=lfs diff=lfs merge=lfs -text
- *.wasm filter=lfs diff=lfs merge=lfs -text
- *.xz filter=lfs diff=lfs merge=lfs -text
- *.zip filter=lfs diff=lfs merge=lfs -text
- *.zst filter=lfs diff=lfs merge=lfs -text
- *tfevents* filter=lfs diff=lfs merge=lfs -text
+ loras/*.safetensors filter=lfs diff=lfs merge=lfs -text
+ examples/*.jpg filter=lfs diff=lfs merge=lfs -text
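(Entries in this format are what `git lfs track "loras/*.safetensors"` and `git lfs track "examples/*.jpg"` append to `.gitattributes`; the commit swaps the stock Hugging Face LFS rules for the two patterns this repo actually stores via LFS.)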
 
.gitignore ADDED
@@ -0,0 +1,127 @@
+ # Byte-compiled / optimized / DLL files
+ __pycache__/
+ *.py[cod]
+ *$py.class
+
+ # C extensions
+ *.so
+
+ # Distribution / packaging
+ .Python
+ build/
+ develop-eggs/
+ dist/
+ downloads/
+ eggs/
+ .eggs/
+ lib/
+ lib64/
+ parts/
+ sdist/
+ var/
+ wheels/
+ *.egg-info/
+ .installed.cfg
+ *.egg
+ MANIFEST
+
+ # PyInstaller
+ *.manifest
+ *.spec
+
+ # Installer logs
+ pip-log.txt
+ pip-delete-this-directory.txt
+
+ # Unit test / coverage reports
+ htmlcov/
+ .tox/
+ .nox/
+ .coverage
+ .coverage.*
+ .cache
+ nosetests.xml
+ coverage.xml
+ *.cover
+ *.py,cover
+ .hypothesis/
+ .pytest_cache/
+
+ # Translations
+ *.mo
+ *.pot
+
+ # Django stuff:
+ *.log
+ local_settings.py
+ db.sqlite3
+
+ # Flask stuff:
+ instance/
+ .webassets-cache
+
+ # Scrapy stuff:
+ .scrapy
+
+ # Sphinx documentation
+ docs/_build/
+
+ # PyBuilder
+ target/
+
+ # Jupyter Notebook
+ .ipynb_checkpoints
+
+ # IPython
+ profile_default/
+ ipython_config.py
+
+ # pyenv
+ .python-version
+
+ # PEP 582; used by PDM, PEP 582 compatible tooling
+ __pypackages__/
+
+ # Celery stuff
+ celerybeat-schedule
+ celerybeat.pid
+
+ # SageMath parsed files
+ *.sage.py
+
+ # Environments
+ .env
+ .venv
+ env/
+ venv/
+ ENV/
+ env.bak/
+ venv.bak/
+
+ # Spyder project settings
+ .spyderproject
+ .spyproject
+
+ # Rope project settings
+ .ropeproject
+
+ # mkdocs documentation
+ /site
+
+ # mypy
+ .mypy_cache/
+ .dmypy.json
+ dmypy.json
+
+ # Pyre type checker
+ .pyre/
+
+ # OS generated files #
+ ######################
+ .DS_Store
+ .DS_Store?
+ ._*
+ .Spotlight-V100
+ .Trashes
+ ehthumbs.db
+ Thumbs.db
README.md CHANGED
@@ -1,14 +1,111 @@
- ---
- title: Pose Preserving Comicfier
- emoji: 📊
- colorFrom: green
- colorTo: green
- sdk: gradio
- sdk_version: 5.29.0
- app_file: app.py
- pinned: false
- license: mit
- short_description: 'Comicfier: Transforms photos into retro Western comic style'
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # Pose-Preserving Comicfier - Gradio App
+
+ [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/Mer-o/Pose-Preserving-Comicfier)
+
+ This repository contains the code for a Gradio web application that transforms input images into a specific retro Western comic book style while preserving the original pose. It uses Stable Diffusion v1.5, ControlNet (OpenPose + Tile), and specific LoRAs.
+
+ This application refactors the workflow initially developed in a [Kaggle Notebook](https://github.com/mehran-khani/SD-Controlnet-Comic-Styler) into a deployable web app.
+
+ ## Features
+
+ * **Pose Preservation:** Uses ControlNet OpenPose to accurately maintain the pose from the input image.
+ * **Retro Comic Style Transfer:** Applies specific LoRAs (`night_comic_V06.safetensors` & `add_detail.safetensors`) for a 1940s Western comic aesthetic with enhanced details.
+ * **Tiled Upscaling:** Implements ControlNet Tile for 2x high-resolution output (1024x1024), improving detail consistency across large images.
+ * **Simplified UI:** Easy-to-use interface with only an image upload and a Generate button.
+ * **Fixed Parameters:** Generation uses pre-defined, optimized parameters (steps, guidance, strength, prompts) based on the original notebook implementation for consistent results.
+ * **Dynamic Backgrounds:** The background elements in the generated image are randomized for variety in the low-resolution stage.
+ * **Broad Image Support:** Accepts common formats like JPG, PNG, WEBP, and HEIC (requires `pillow-heif`).
+
+ ## Technology Stack
+
+ * **Python 3**
+ * **Gradio:** Web UI framework.
+ * **PyTorch:** Core ML framework.
+ * **Hugging Face Libraries:**
+     * `diffusers`: Stable Diffusion pipelines, ControlNet integration.
+     * `transformers`: Underlying model components.
+     * `accelerate`: Hardware acceleration utilities.
+     * `peft`: LoRA loading and management.
+ * **ControlNet:**
+     * OpenPose Detector (`controlnet_aux`)
+     * OpenPose ControlNet Model (`lllyasviel/sd-controlnet-openpose`)
+     * Tile ControlNet Model (`lllyasviel/control_v11f1e_sd15_tile`)
+ * **Base Model:** `runwayml/stable-diffusion-v1-5`
+ * **LoRAs Used:**
+     * Style: [Western Comics Style](https://civitai.com/models/1081588/western-comics-style) (using `night_comic_V06.safetensors`)
+     * Detail: [Detail Tweaker LoRA](https://civitai.com/models/58390/detail-tweaker-lora-lora) (using `add_detail.safetensors`)
+ * **Image Processing:** `Pillow`, `pillow-heif`, `numpy`, `opencv-python-headless`
+ * **Dependencies:** `matplotlib`, `mediapipe` (required by `controlnet_aux`)
+
+ ## Workflow Overview
+
+ 1. **Image Preparation (`image_utils.py`):** The input image is loaded (supports HEIC), converted to RGB, EXIF orientation is handled, and the result is force-resized to 512x512.
+ 2. **Pose Detection (`pipelines.py`):** An OpenPose map is extracted from the resized image using `controlnet_aux`.
+ 3. **Low-Resolution Generation (`pipelines.py`):**
+     * An SDv1.5 Img2Img pipeline with Pose ControlNet is dynamically loaded.
+     * Prompts are generated (`prompts.py`) with a fixed base/style and a *randomized* background element.
+     * Style and Detail LoRAs are applied.
+     * A 512x512 image is generated using fixed parameters.
+     * The pipeline is unloaded to conserve VRAM.
+ 4. **High-Resolution Tiling (`pipelines.py`):**
+     * The 512x512 image is upscaled 2x (to 1024x1024) using Lanczos resampling (creating a blurry base).
+     * An SDv1.5 Img2Img pipeline with Tile ControlNet is dynamically loaded.
+     * Tile-specific prompts (excluding the random background) are used.
+     * Style and Detail LoRAs are applied (potentially with different weights).
+     * The image is processed in overlapping 1024x1024 tiles.
+     * Processed tiles are blended back together using an alpha mask (`image_utils.py`).
+     * The pipeline is unloaded.
+ 5. **Output (`app.py`):** The final 1024x1024 image is displayed in the Gradio UI.
+
+ ## How to Run Locally
+
+ *(Requires sufficient RAM/CPU or a compatible GPU, Python 3.8+, and Git)*
+
+ 1. **Clone the repository:**
+     ```bash
+     git clone https://github.com/mehran-khani/Pose-Preserving-Comicfier.git
+     cd Pose-Preserving-Comicfier
+     ```
+ 2. **Create and activate a Python virtual environment:**
+     ```bash
+     python3 -m venv .venv
+     source .venv/bin/activate        # Linux / macOS
+     # .\.venv\Scripts\Activate.ps1   # Windows PowerShell
+     # .\.venv\Scripts\activate.bat   # Windows cmd
+     ```
+ 3. **Install dependencies:**
+     ```bash
+     pip install -r requirements.txt
+     ```
+     *(Note: PyTorch installation might require specific commands depending on your OS/CUDA setup if using a local GPU. See the PyTorch website.)*
+ 4. **Download LoRA files:**
+     * Create a folder named `loras` in the project root.
+     * Download `night_comic_V06.safetensors` (from the Civitai link above) and place it in the `loras` folder.
+     * Download `add_detail.safetensors` (from the Civitai link above) and place it in the `loras` folder.
+ 5. **Run the Gradio app:**
+     ```bash
+     python app.py
+     ```
+ 6. Open the local URL provided (e.g., `http://127.0.0.1:7860`) in your browser. *(Note: Execution will be very slow without a suitable GPU.)*
+
+ ## Deployment to Hugging Face Spaces
+
+ This app is designed for deployment on Hugging Face Spaces, ideally with GPU hardware.
+
+ 1. Ensure all code (`*.py`), `requirements.txt`, `.gitignore`, and the `loras` folder (containing the `.safetensors` files) are committed and pushed to this GitHub repository.
+ 2. Create a new Space on Hugging Face ([huggingface.co/new-space](https://huggingface.co/new-space)).
+ 3. Choose an owner, a Space name, and select "Gradio" as the Space SDK.
+ 4. Select the desired hardware (e.g., "T4 small" under GPU options). Note that compute costs may apply.
+ 5. Choose "Use existing GitHub repository".
+ 6. Enter the URL of this GitHub repository.
+ 7. Click "Create Space". The Space will build the environment from `requirements.txt` and run `app.py`. Monitor the build and runtime logs for any issues.
+
+ ## Limitations
+
+ * **Speed:** Generation requires significant time (minutes), especially on shared/free GPU hardware, due to the multi-stage process and dynamic model loading between stages. CPU execution is impractically slow.
+ * **VRAM:** While optimized with dynamic pipeline unloading, the process still requires considerable GPU VRAM (>10GB peak). Out-of-memory errors are possible on lower-VRAM GPUs.
+ * **Fixed Style:** The artistic style (prompts, LoRAs, parameters) is fixed in the code to replicate the notebook's specific output and cannot be changed via the UI.
+
+ ## License
+
+ MIT License
app.py ADDED
@@ -0,0 +1,272 @@
+ """
+ Main application script for the Gradio interface.
+
+ This script initializes the application, loads prerequisite models via model_loader,
+ defines the user interface using Gradio Blocks, and orchestrates the multi-stage
+ image generation process by calling functions from the pipelines module.
+ """
+
+ import gradio as gr
+ import gradio.themes as gr_themes
+ import time
+ import os
+ import random
+
+ # --- Imports from our custom modules ---
+ try:
+     from image_utils import prepare_image
+     from model_loader import load_models, are_models_loaded
+     from pipelines import run_pose_detection, run_low_res_generation, run_hires_tiling, cleanup_memory
+     print("Helper modules imported successfully.")
+ except ImportError as e:
+     print(f"ERROR: Failed to import required local modules: {e}")
+     print("Please ensure prompts.py, image_utils.py, model_loader.py, and pipelines.py are in the same directory.")
+     raise SystemExit(f"Module import failed: {e}")
+
+ # --- Constants & UI Configuration ---
+ DEFAULT_SEED = 1024
+ DEFAULT_STEPS_LOWRES = 30
+ DEFAULT_GUIDANCE_LOWRES = 8.0
+ DEFAULT_STRENGTH_LOWRES = 0.05
+ DEFAULT_CN_SCALE_LOWRES = 1.0
+
+ DEFAULT_STEPS_HIRES = 20
+ DEFAULT_GUIDANCE_HIRES = 8.0
+ DEFAULT_STRENGTH_HIRES = 0.75
+ DEFAULT_CN_SCALE_HIRES = 1.0
+
+ # OUTPUT_DIR = "outputs"
+ # os.makedirs(OUTPUT_DIR, exist_ok=True)
+
+ # --- Load Prerequisite Models at Startup ---
+ if not are_models_loaded():
+     print("Initial model loading required...")
+     load_successful = load_models()
+     if not load_successful:
+         print("FATAL: Failed to load prerequisite models. The application may not work correctly.")
+ else:
+     print("Models were already loaded.")
+
+
+ # --- Main Processing Function ---
+ def generate_full_pipeline(
+     input_image_path,
+     progress=gr.Progress(track_tqdm=True)
+ ):
+     """
+     Orchestrates the entire image generation workflow.
+
+     This function is called when the user clicks the 'Generate' button in the UI.
+     It takes the uploaded image, calls the necessary processing steps in sequence
+     (prepare, detect pose, low-res gen, hi-res gen), updates the progress bar,
+     and returns the final generated image. All generation parameters (seed, steps,
+     guidance, strength, ControlNet scales) are fixed module-level constants.
+
+     Args:
+         input_image_path (str): Path to the uploaded input image file.
+         progress (gr.Progress): Gradio progress tracking object.
+
+     Returns:
+         PIL.Image.Image | None: The final generated high-resolution image,
+                                 or the low-resolution image as a fallback if
+                                 tiling fails, or None if critical errors occur early.
+
+     Raises:
+         gr.Error: If critical steps like image preparation or pose detection fail.
+             If hi-res tiling fails after a successful low-res pass, a gr.Warning
+             is shown instead and the low-res image is returned.
+     """
+     print("\n--- Starting New Generation Run ---")
+     run_start_time = time.time()
+
+     current_seed = DEFAULT_SEED
+     if current_seed == -1:
+         current_seed = random.randint(0, 9999999)
+         print(f"Using Random Seed: {current_seed}")
+     else:
+         print(f"Using Fixed Seed: {current_seed}")
+
+     low_res_image = None
+     final_image = None
+
+     try:
+         progress(0.05, desc="Preparing Input Image...")
+         resized_input_image = prepare_image(input_image_path, target_size=512)
+         if resized_input_image is None:
+             raise gr.Error("Failed to load or prepare the input image. Check format/corruption.")
+
+         progress(0.15, desc="Detecting Pose...")
+         pose_map = run_pose_detection(resized_input_image)
+         if pose_map is None:
+             raise gr.Error("Failed to detect pose from the input image.")
+         # try: pose_map.save(os.path.join(OUTPUT_DIR, f"pose_map_{current_seed}.png"))
+         # except Exception as save_e: print(f"Warning: Could not save pose map: {save_e}")
+
+         progress(0.25, desc="Starting Low-Res Generation...")
+         low_res_image = run_low_res_generation(
+             resized_input_image=resized_input_image,
+             pose_map=pose_map,
+             seed=int(current_seed),
+             steps=int(DEFAULT_STEPS_LOWRES),
+             guidance_scale=float(DEFAULT_GUIDANCE_LOWRES),
+             strength=float(DEFAULT_STRENGTH_LOWRES),
+             controlnet_scale=float(DEFAULT_CN_SCALE_LOWRES),
+             progress=progress
+         )
+         print("Low-res generation stage completed successfully.")
+         # try: low_res_image.save(os.path.join(OUTPUT_DIR, f"lowres_output_{current_seed}.png"))
+         # except Exception as save_e: print(f"Warning: Could not save low-res image: {save_e}")
+         progress(0.45, desc="Low-Res Generation Complete.")
+
+         progress(0.50, desc="Starting Hi-Res Tiling...")
+         final_image = run_hires_tiling(
+             low_res_image=low_res_image,
+             seed=int(current_seed),
+             steps=int(DEFAULT_STEPS_HIRES),
+             guidance_scale=float(DEFAULT_GUIDANCE_HIRES),
+             strength=float(DEFAULT_STRENGTH_HIRES),
+             controlnet_scale=float(DEFAULT_CN_SCALE_HIRES),
+             upscale_factor=2,
+             tile_size=1024,
+             tile_stride=1024,
+             progress=progress
+         )
+         print("Hi-res tiling stage completed successfully.")
+         # try: final_image.save(os.path.join(OUTPUT_DIR, f"hires_output_{current_seed}.png"))
+         # except Exception as save_e: print(f"Warning: Could not save final image: {save_e}")
+
+         progress(1.0, desc="Complete!")
+
+     except gr.Error as e:
+         print(f"Gradio Error occurred: {e}")
+         if final_image is None and low_res_image is not None and ("tiling" in str(e).lower() or "hi-res" in str(e).lower()):
+             gr.Warning(f"High-resolution upscaling failed ({e}). Returning low-resolution image.")
+             final_image = low_res_image
+         else:
+             raise e
+     except Exception as e:
+         print(f"An unexpected error occurred in generate_full_pipeline: {e}")
+         import traceback
+         traceback.print_exc()
+         raise gr.Error(f"An unexpected error occurred: {e}")
+     finally:
+         print("Running final cleanup check...")
+         cleanup_memory()
+         run_end_time = time.time()
+         print(f"--- Full Pipeline Run Finished in {run_end_time - run_start_time:.2f} seconds ---")
+
+     return final_image
+
+
+ # --- Gradio Interface Definition ---
+
+ theme = gr_themes.Soft(primary_hue=gr_themes.colors.blue, secondary_hue=gr_themes.colors.sky)
+
+ # New, improved Markdown description
+ DESCRIPTION = """
+ <div style="text-align: center;">
+     <h1 style="font-family: Impact, Charcoal, sans-serif; font-size: 280%; font-weight: 900; margin-bottom: 16px;">
+         Pose-Preserving Comicfier
+     </h1>
+     <p style="margin-bottom: 12px; font-size: 94%">
+         Transform your photos into the gritty style of a 1940s Western comic! This app uses Stable Diffusion + ControlNet
+         to apply the artistic look while keeping the original pose intact. Just upload your image and click Generate!
+     </p>
+     <p style="font-size: 85%;"><em>(Generation can take several minutes on shared hardware. Prompts & parameters are fixed.)</em></p>
+     <p style="font-size: 80%; color: grey;">
+         <a href="https://github.com/mehran-khani" target="_blank">[View Project on GitHub]</a> |
+         <a href="https://huggingface.co/spaces/.../discussions" target="_blank">[Report an Issue]</a>
+     </p>
+     <!-- Remember to replace placeholders above with your actual links -->
+ </div>
+ """
+
+ EXAMPLE_IMAGES_DIR = "examples"
+ EXAMPLE_IMAGES = [
+     os.path.join(EXAMPLE_IMAGES_DIR, "example1.jpg"),
+     os.path.join(EXAMPLE_IMAGES_DIR, "example2.jpg"),
+     os.path.join(EXAMPLE_IMAGES_DIR, "example3.jpg"),
+     os.path.join(EXAMPLE_IMAGES_DIR, "example4.jpg"),
+     os.path.join(EXAMPLE_IMAGES_DIR, "example5.jpg"),
+     os.path.join(EXAMPLE_IMAGES_DIR, "example6.jpg"),
+ ]
+ EXAMPLE_IMAGES = [img for img in EXAMPLE_IMAGES if os.path.exists(img)]
+
+ CUSTOM_CSS = """
+ /* Target the container div Gradio uses for the Image component */
+ .gradio-image {
+     width: 100%;       /* Ensure the container fills the column width */
+     height: 100%;      /* Ensure the container fills the height set by the component (e.g., height=400) */
+     overflow: hidden;  /* Hide any potential overflow before object-fit applies */
+ }
+
+ /* Target the actual <img> tag inside the container */
+ .gradio-image img {
+     display: block;    /* Remove potential bottom spacing */
+     width: 100%;       /* Force image width to match container */
+     height: 100%;      /* Force image height to match container */
+     object-fit: cover; /* Scale/crop image to cover this forced W/H */
+ }
+
+ footer { visibility: hidden }
+ """
+
+ with gr.Blocks(theme=theme, css=CUSTOM_CSS, title="Pose-Preserving Comicfier") as demo:
+     gr.HTML(DESCRIPTION)
+
+     with gr.Row():
+         # Input Column
+         with gr.Column(scale=1, min_width=350):
+             # REMOVED height=400
+             input_image = gr.Image(
+                 type="filepath",
+                 label="Upload Your Image Here"
+             )
+             generate_button = gr.Button("Generate Comic Image", variant="primary")
+
+         # Output Column
+         with gr.Column(scale=1, min_width=350):
+             # REMOVED height=400
+             output_image = gr.Image(
+                 type="pil",
+                 label="Generated Comic Image",
+                 interactive=False
+             )
+
+     # Examples Section
+     if EXAMPLE_IMAGES:
+         gr.Examples(
+             examples=EXAMPLE_IMAGES,
+             inputs=[input_image],
+             outputs=[output_image],
+             fn=generate_full_pipeline,
+             cache_examples=False
+         )
+
+     generate_button.click(
+         fn=generate_full_pipeline,
+         inputs=[input_image],
+         outputs=[output_image],
+         api_name="generate"
+     )
+
+
+ # --- Launch the Gradio App ---
+ if __name__ == "__main__":
+     if not are_models_loaded():
+         print("Attempting to load models before launch...")
+         if not load_models():
+             print("FATAL: Model loading failed on launch. App may not function.")
+
+     print("Attempting to launch Gradio demo...")
+     demo.queue().launch(debug=False, share=False)
+     print("Gradio app launched. Access it at the URL provided above.")
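Because the click handler above registers `api_name="generate"`, the Space can also be driven programmatically. A minimal client sketch (an illustration, not part of the commit; it assumes the Space id `Mer-o/Pose-Preserving-Comicfier` from the README badge and a `gradio_client` 1.x install):

```python
# Hypothetical remote-call sketch; requires `pip install gradio_client`.
from gradio_client import Client, handle_file

client = Client("Mer-o/Pose-Preserving-Comicfier")  # Space id assumed from README

# One positional input (the image filepath) mirrors inputs=[input_image];
# api_name matches the api_name="generate" registered on the click event.
result = client.predict(handle_file("my_photo.jpg"), api_name="/generate")
print(result)  # path to the downloaded comic image
```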
examples/example1.jpg ADDED

Git LFS Details

  • SHA256: 66c374f373be7ea050820ba736dde2cb6844e627d1095af8c68155388c5bc3ba
  • Pointer size: 132 Bytes
  • Size of remote file: 1.25 MB
examples/example2.jpg ADDED

Git LFS Details

  • SHA256: f68db724f0bc96d82525d6948be57d5ad2d43fd444ff2b66d36e7c1bbaea2443
  • Pointer size: 132 Bytes
  • Size of remote file: 3.19 MB
examples/example3.jpg ADDED

Git LFS Details

  • SHA256: e3a212b1b7f0de6044731814c7fd02a8e77aecc355f155a3e5002d07c64db726
  • Pointer size: 131 Bytes
  • Size of remote file: 803 kB
examples/example4.jpg ADDED

Git LFS Details

  • SHA256: 7bbf81b4d1b4eb69b78a969a7364ebd280d4e918c27d2fb327e5624994e9f0f5
  • Pointer size: 132 Bytes
  • Size of remote file: 2.48 MB
examples/example5.jpg ADDED

Git LFS Details

  • SHA256: 2e5783394cf58ce5f5725b54701e0ae90ed6595e9ae4bb6b53fd7cb08666885f
  • Pointer size: 132 Bytes
  • Size of remote file: 1.85 MB
examples/example6.jpg ADDED

Git LFS Details

  • SHA256: 6c663c8c65e8568be4241bc269204910a5c7c09878bcdb6872a523e2d6045889
  • Pointer size: 131 Bytes
  • Size of remote file: 480 kB
image_utils.py ADDED
@@ -0,0 +1,134 @@
+ """
+ Contains utility functions for image loading, preparation, and manipulation.
+ Includes HEIC image format support via the optional 'pillow-heif' library.
+ """
+
+ from PIL import Image, ImageOps, ImageDraw
+ import os
+
+ try:
+     from pillow_heif import register_heif_opener
+     register_heif_opener()
+     print("HEIC opener registered successfully using pillow-heif.")
+     _heic_support = True
+ except ImportError:
+     print("Warning: pillow-heif not installed. HEIC/HEIF support will be disabled.")
+     _heic_support = False
+
+
+ print("Loading Image Utils...")
+
+ def prepare_image(image_filepath, target_size=512):
+     """
+     Prepares an input image file for the diffusion pipeline.
+
+     Loads an image from the given filepath (supports standard formats like
+     JPG, PNG, WEBP, and HEIC/HEIF), ensures it's in RGB format, handles EXIF
+     orientation, and performs a forced resize to a square target_size,
+     ignoring the original aspect ratio.
+
+     Args:
+         image_filepath (str): The path to the image file.
+         target_size (int): The target dimension for both width and height.
+
+     Returns:
+         PIL.Image.Image | None: The prepared image as a PIL Image object in RGB format,
+                                 or None if loading or processing fails.
+     """
+     if image_filepath is None:
+         print("Warning: prepare_image received None filepath.")
+         return None
+
+     if not isinstance(image_filepath, str) or not os.path.exists(image_filepath):
+         print(f"Error: Invalid filepath provided to prepare_image: {image_filepath}")
+         if isinstance(image_filepath, Image.Image):
+             print("Warning: Received PIL Image instead of filepath, proceeding...")
+             image = image_filepath
+         else:
+             return None
+     else:
+         # --- Load Image from Filepath ---
+         print(f"Loading image from path: {image_filepath}")
+         try:
+             image = Image.open(image_filepath)
+         except ImportError as e:
+             print(f"ImportError during Image.open: {e}. Is pillow-heif installed?")
+             print("Cannot process image format.")
+             return None
+         except Exception as e:
+             print(f"Error opening image file {image_filepath} with PIL: {e}")
+             return None
+
+     # --- Process PIL Image ---
+     try:
+         image = ImageOps.exif_transpose(image)
+
+         image = image.convert("RGB")
+
+         original_width, original_height = image.size
+
+         final_width = target_size
+         final_height = target_size
+
+         resized_image = image.resize((final_width, final_height), Image.LANCZOS)
+
+         print(f"Original size: ({original_width}, {original_height}), FORCED Resized to: ({final_width}, {final_height})")
+         return resized_image
+     except Exception as e:
+         print(f"Error during PIL image processing steps: {e}")
+         return None
+
+ def create_blend_mask(tile_size=1024, overlap=256):
+     """
+     Creates a feathered blending mask (alpha mask) for smooth tile stitching.
+
+     Generates a square mask where the edges have a linear gradient ramp within
+     the specified overlap zone, and the central area is fully opaque.
+     Assumes overlap occurs equally on all four sides.
+
+     Args:
+         tile_size (int): The dimension (width and height) of the tiles being processed.
+         overlap (int): The number of pixels that overlap between adjacent tiles.
+
+     Returns:
+         PIL.Image.Image: The blending mask as a PIL Image object in 'L' (grayscale) mode.
+                          White (255) areas are fully opaque, black (0) are transparent,
+                          gray values provide blending.
+     """
+     if overlap >= tile_size // 2:
+         print("Warning: Overlap is large relative to tile size, mask generation might be suboptimal.")
+         overlap = tile_size // 2 - 1
+
+     mask = Image.new("L", (tile_size, tile_size), 0)
+     draw = ImageDraw.Draw(mask)
+
+     if overlap > 0:
+         for i in range(overlap):
+             alpha = int(255 * (i / float(overlap)))
+
+             # Left edge ramp
+             draw.line([(i, 0), (i, tile_size)], fill=alpha)
+             # Right edge ramp
+             draw.line([(tile_size - 1 - i, 0), (tile_size - 1 - i, tile_size)], fill=alpha)
+             # Top edge ramp
+             draw.line([(0, i), (tile_size, i)], fill=alpha)
+             # Bottom edge ramp
+             draw.line([(0, tile_size - 1 - i), (tile_size, tile_size - 1 - i)], fill=alpha)
+
+     center_start = overlap
+     center_end_x = tile_size - overlap
+     center_end_y = tile_size - overlap
+
+     if center_end_x > center_start and center_end_y > center_start:
+         draw.rectangle((center_start, center_start, center_end_x - 1, center_end_y - 1), fill=255)
+     else:
+         center_x, center_y = tile_size // 2, tile_size // 2
+         draw.point((center_x, center_y), fill=255)
+         if tile_size % 2 == 0:
+             draw.point((center_x - 1, center_y), fill=255)
+             draw.point((center_x, center_y - 1), fill=255)
+             draw.point((center_x - 1, center_y - 1), fill=255)
+
+     print(f"Blend mask created (Size: {tile_size}x{tile_size}, Overlap: {overlap})")
+     return mask
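To make the blending contract concrete, here is a small standalone sketch (illustration only, with placeholder tile content) of how a mask from `create_blend_mask` feathers a new tile over already-stitched pixels, mirroring the stitching step in `pipelines.py`:

```python
# Minimal sketch (not part of the app) of mask-based tile compositing.
from PIL import Image
from image_utils import create_blend_mask

tile_size, overlap = 1024, 256
mask = create_blend_mask(tile_size, overlap)    # 'L' mode: 255 keeps the new tile

canvas = Image.new("RGB", (2048, 2048), "black")            # stitched output so far
new_tile = Image.new("RGB", (tile_size, tile_size), "red")  # placeholder tile content

box = (0, 0, tile_size, tile_size)                   # region this tile covers
existing = canvas.crop(box)                          # pixels from earlier tiles
blended = Image.composite(new_tile, existing, mask)  # white->tile, black->canvas
canvas.paste(blended, (0, 0))
```

`Image.composite` takes the new tile where the mask is white, keeps the canvas where it is black, and mixes the two across the gray ramp, which is what hides the tile seams.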
loras/add_detail.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:47aaaf0d2945ca937151d61304946dd229b3f072140b85484bc93e38f2a6e2f7
+ size 37861176
loras/night_comic_V06.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:caf9280080bf4183a9064547a9374bc22575248867a9bcc54a883305b57a8ebb
+ size 14153788
model_loader.py ADDED
@@ -0,0 +1,133 @@
+ """
+ Handles the loading and management of necessary AI models from Hugging Face Hub.
+
+ Provides functions to load models once at startup and access them throughout
+ the application, managing device placement (CPU/GPU) and data types.
+ Optimized for typical Hugging Face Space GPU environments.
+ """
+
+ import torch
+ from diffusers import ControlNetModel
+ from controlnet_aux import OpenposeDetector
+ import gc
+
+ # --- Configuration ---
+ # Automatically detect CUDA availability and set appropriate device/dtype
+ if torch.cuda.is_available():
+     DEVICE = "cuda"
+     DTYPE = torch.float16
+     print(f"CUDA available. Using Device: {DEVICE}, Dtype: {DTYPE}")
+     try:
+         print(f"GPU Name: {torch.cuda.get_device_name(0)}")
+     except Exception as e:
+         print(f"Couldn't get GPU name: {e}")
+ else:
+     DEVICE = "cpu"
+     DTYPE = torch.float32
+     print(f"CUDA not available. Using Device: {DEVICE}, Dtype: {DTYPE}")
+
+
+ # Model IDs from Hugging Face Hub
+ # BASE_MODEL_ID = "runwayml/stable-diffusion-v1-5"  # Base SD model ID needed by pipelines
+ OPENPOSE_DETECTOR_ID = 'lllyasviel/ControlNet'  # Preprocessor model repo
+ CONTROLNET_POSE_MODEL_ID = "lllyasviel/sd-controlnet-openpose"  # OpenPose ControlNet weights
+ CONTROLNET_TILE_MODEL_ID = "lllyasviel/control_v11f1e_sd15_tile"  # Tile ControlNet weights
+
+ _openpose_detector = None
+ _controlnet_pose = None
+ _controlnet_tile = None
+ _models_loaded = False
+
+ # --- Loading Function ---
+
+ def load_models(force_reload=False):
+     """
+     Loads the OpenPose detector (to CPU) and ControlNet models (to configured DEVICE).
+
+     This function should typically be called once when the application starts.
+     It checks if models are already loaded to prevent redundant loading unless
+     `force_reload` is True.
+
+     Args:
+         force_reload (bool): If True, forces reloading even if models are already loaded.
+
+     Returns:
+         bool: True if all models were loaded successfully (or already were), False otherwise.
+     """
+     global _openpose_detector, _controlnet_pose, _controlnet_tile, _models_loaded
+
+     if _models_loaded and not force_reload:
+         print("Models already loaded.")
+         return True
+
+     print("--- Loading Models ---")
+     if DEVICE == "cuda":
+         print("Performing initial CUDA cache clear...")
+         gc.collect()
+         torch.cuda.empty_cache()
+
+     # 1. OpenPose Detector
+     try:
+         print(f"Loading OpenPose Detector from {OPENPOSE_DETECTOR_ID} to CPU...")
+         _openpose_detector = OpenposeDetector.from_pretrained(OPENPOSE_DETECTOR_ID)
+         print("OpenPose detector loaded successfully (on CPU).")
+     except Exception as e:
+         print(f"ERROR: Failed to load OpenPose Detector: {e}")
+         _models_loaded = False
+         return False
+
+     # 2. ControlNet Models
+     try:
+         print(f"Loading ControlNet Pose Model from {CONTROLNET_POSE_MODEL_ID} to {DEVICE} ({DTYPE})...")
+         _controlnet_pose = ControlNetModel.from_pretrained(
+             CONTROLNET_POSE_MODEL_ID, torch_dtype=DTYPE
+         )
+         _controlnet_pose.to(DEVICE)
+         print("ControlNet Pose model loaded successfully.")
+     except Exception as e:
+         print(f"ERROR: Failed to load ControlNet Pose Model: {e}")
+         _models_loaded = False
+         return False
+
+     try:
+         print(f"Loading ControlNet Tile Model from {CONTROLNET_TILE_MODEL_ID} to {DEVICE} ({DTYPE})...")
+         _controlnet_tile = ControlNetModel.from_pretrained(
+             CONTROLNET_TILE_MODEL_ID, torch_dtype=DTYPE
+         )
+         _controlnet_tile.to(DEVICE)
+         print("ControlNet Tile model loaded successfully.")
+     except Exception as e:
+         print(f"ERROR: Failed to load ControlNet Tile Model: {e}")
+         _models_loaded = False
+         return False
+
+     _models_loaded = True
+     print("--- All prerequisite models loaded successfully. ---")
+     if DEVICE == "cuda":
+         print("Performing post-load CUDA cache clear...")
+         gc.collect()
+         torch.cuda.empty_cache()
+     return True
+
+ # --- Getter Functions ---
+
+ def get_openpose_detector():
+     if not _models_loaded: load_models()
+     return _openpose_detector
+
+ def get_controlnet_pose():
+     if not _models_loaded: load_models()
+     return _controlnet_pose
+
+ def get_controlnet_tile():
+     if not _models_loaded: load_models()
+     return _controlnet_tile
+
+ def get_device():
+     return DEVICE
+
+ def get_dtype():
+     return DTYPE
+
+ def are_models_loaded():
+     return _models_loaded
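The getters double as a lazy-loading entry point: any call triggers `load_models()` if startup loading was skipped. A minimal usage sketch (illustrative; this is how `pipelines.py` below consumes the module):

```python
# Illustrative only: lazy access to the shared models.
from model_loader import get_controlnet_pose, get_device, get_dtype

controlnet = get_controlnet_pose()  # loads everything on first use if needed
print(f"Pose ControlNet ready on {get_device()} ({get_dtype()})")
```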
pipelines.py ADDED
@@ -0,0 +1,433 @@
+ """
+ Contains functions to execute the main image generation stages:
+ 1. OpenPose Detection: Extracts pose information.
+ 2. Low-Resolution Generation: Creates initial image using Pose ControlNet.
+ 3. High-Resolution Tiling: Upscales the low-res image using Tile ControlNet.
+
+ Manages dynamic loading/unloading of diffusion pipelines to conserve VRAM.
+ """
+
+ import torch
+ import gc
+ import time
+ import os
+ from PIL import Image
+ from tqdm.auto import tqdm
+ import gradio as gr
+ from diffusers import (
+     StableDiffusionControlNetImg2ImgPipeline,
+     UniPCMultistepScheduler,
+ )
+ from model_loader import (
+     get_openpose_detector,
+     get_controlnet_pose,
+     get_controlnet_tile,
+     get_device,
+     get_dtype,
+     are_models_loaded,
+ )
+ from image_utils import create_blend_mask
+ from prompts import get_prompts_for_run
+
+
+ # --- Configuration ---
+ BASE_MODEL_ID = "runwayml/stable-diffusion-v1-5"
+ LORA_DIR = "loras"
+ LORA_FILES = {
+     "style": os.path.join(LORA_DIR, "night_comic_V06.safetensors"),
+     "detail": os.path.join(LORA_DIR, "add_detail.safetensors"),
+ }
+ LORA_WEIGHTS_LOWRES = [1, 1]
+ LORA_WEIGHTS_HIRES = [1, 2]
+ ACTIVE_ADAPTERS = ["style", "detail"]
+
+ def cleanup_memory():
+     """Forces garbage collection and clears CUDA cache."""
+     gc.collect()
+     if torch.cuda.is_available():
+         torch.cuda.empty_cache()
+
+ # --- Stage 1: OpenPose Detection ---
+ def run_pose_detection(resized_input_image):
+     """
+     Detects human pose (body, hands, face) from the input image using OpenPose.
+
+     Temporarily moves the OpenPose detector model to the active GPU (if available)
+     for processing and then moves it back to the CPU to conserve VRAM.
+
+     Args:
+         resized_input_image (PIL.Image.Image): The input image, already resized
+                                                and in RGB format.
+
+     Returns:
+         PIL.Image.Image | None: A PIL Image representing the detected pose map,
+                                 or None if detection fails or models aren't loaded.
+     """
+     if not are_models_loaded():
+         print("Error: Cannot run pose detection, models not loaded.")
+         return None
+
+     detector = get_openpose_detector()
+     device = get_device()
+     control_image_openpose = None
+
+     if detector is None:
+         print("Error: OpenPose detector is None.")
+         return None
+
+     try:
+         detector.to(device)
+         cleanup_memory()
+
+         control_image_openpose = detector(
+             resized_input_image, include_face=True, include_hand=True
+         )
+
+     except Exception as e:
+         print(f"ERROR during OpenPose detection: {e}")
+         control_image_openpose = None
+     finally:
+         detector.to("cpu")
+         cleanup_memory()
+
+     return control_image_openpose
+
+ # --- Stage 2: Low-Resolution Generation ---
+ def run_low_res_generation(
+     resized_input_image,
+     pose_map,
+     seed,
+     steps,
+     guidance_scale,
+     strength,
+     controlnet_scale=0.8,
+     progress=gr.Progress(track_tqdm=True)
+ ):
+     """
+     Generates the initial low-resolution image using Img2Img with Pose ControlNet.
+
+     Dynamically loads the StableDiffusionControlNetImg2ImgPipeline, applies LoRAs,
+     runs inference, and then unloads the pipeline to free VRAM before returning.
+
+     Args:
+         resized_input_image (PIL.Image.Image): The resized input image.
+         pose_map (PIL.Image.Image): The pose map generated by run_pose_detection.
+         seed (int): The random seed for generation.
+         steps (int): Number of diffusion inference steps.
+         guidance_scale (float): Classifier-free guidance scale.
+         strength (float): Img2Img strength (0.0 to 1.0). How much noise to add.
+         controlnet_scale (float): Conditioning scale for the Pose ControlNet.
+         progress (gr.Progress): Gradio progress object for UI updates.
+
+     Returns:
+         PIL.Image.Image | None: The generated low-resolution PIL Image, or None if an error occurs.
+
+     Raises:
+         gr.Error: Raises a Gradio error if generation fails catastrophically.
+     """
+     if not are_models_loaded() or pose_map is None:
+         error_msg = "Cannot run low-res generation: "
+         if not are_models_loaded(): error_msg += "Models not loaded. "
+         if pose_map is None: error_msg += "Pose map is missing."
+         print(f"Error: {error_msg}")
+         return None
+
+     device = get_device()
+     dtype = get_dtype()
+     controlnet_pose = get_controlnet_pose()
+     output_image_low_res = None
+     pipe_lowres = None
+
+     positive_prompt, negative_prompt, _, _ = get_prompts_for_run()
+     generator = torch.Generator(device=device).manual_seed(int(seed))
+
+     progress(0, desc="Loading Low-Res Pipeline...")
+     try:
+         # 1. Load Pipeline
+         pipe_lowres = StableDiffusionControlNetImg2ImgPipeline.from_pretrained(
+             BASE_MODEL_ID,
+             controlnet=controlnet_pose,
+             torch_dtype=dtype,
+             safety_checker=None
+         )
+         pipe_lowres.scheduler = UniPCMultistepScheduler.from_config(pipe_lowres.scheduler.config)
+         pipe_lowres.to(device)
+
+         cleanup_memory()
+
+         # 2. Load LoRAs
+         if os.path.exists(LORA_FILES["style"]) and os.path.exists(LORA_FILES["detail"]):
+             pipe_lowres.load_lora_weights(LORA_FILES["style"], adapter_name="style")
+             pipe_lowres.load_lora_weights(LORA_FILES["detail"], adapter_name="detail")
+             pipe_lowres.set_adapters(ACTIVE_ADAPTERS, adapter_weights=LORA_WEIGHTS_LOWRES)
+             print(f"Activated LoRAs: {ACTIVE_ADAPTERS} with weights {LORA_WEIGHTS_LOWRES}")
+         else:
+             print("Warning: One or both LoRA files not found. Skipping LoRA loading.")
+             raise gr.Error("Required LoRA files not found in loras/ directory.")
+
+         # 3. Run Inference
+         progress(0.3, desc="Generating Low-Res Image...")
+         output_image_low_res = pipe_lowres(
+             prompt=positive_prompt,
+             negative_prompt=negative_prompt,
+             image=resized_input_image,
+             control_image=pose_map,
+             num_inference_steps=int(steps),
+             strength=strength,
+             guidance_scale=guidance_scale,
+             controlnet_conditioning_scale=float(controlnet_scale),
+             generator=generator,
+         ).images[0]
+         progress(0.9, desc="Low-Res Complete")
+
+     except Exception as e:
+         print(f"ERROR during Low-Res Generation Pipeline: {e}")
+         import traceback
+         traceback.print_exc()
+         output_image_low_res = None
+         raise gr.Error(f"Failed during low-res generation: {e}")
+     finally:
+         # 4. Cleanup Pipeline
+         print("Cleaning up Low-Res pipeline...")
+         if pipe_lowres is not None:
+             try:
+                 if hasattr(pipe_lowres, 'get_active_adapters') and pipe_lowres.get_active_adapters():
+                     print("Unloading LoRAs...")
+                     pipe_lowres.unload_lora_weights()
+             except Exception as unload_e:
+                 print(f"Note: Error unloading LoRAs: {unload_e}")
+
+             print("Moving Low-Res pipe components to CPU before deleting...")
+             try: pipe_lowres.to('cpu')
+             except Exception as cpu_e: print(f"Note: Error moving pipe to CPU: {cpu_e}")
+
+             print("Deleting Low-Res pipeline object...")
+             del pipe_lowres
+             pipe_lowres = None
+
+         print("Running garbage collection and emptying CUDA cache after Low-Res...")
+         cleanup_memory()
+         # time.sleep(1)
+
+     print("--- Low-Res Generation Stage Finished ---")
+     return output_image_low_res
+
+ # --- Stage 3: High-Resolution Tiling Upscaling ---
+ def run_hires_tiling(
+     low_res_image,
+     seed,
+     steps,
+     guidance_scale,
+     strength,
+     controlnet_scale=1.0,
+     upscale_factor=2,
+     tile_size=1024,
+     tile_stride=1024,
+     progress=gr.Progress(track_tqdm=True)
+ ):
+     """
+     Upscales the low-resolution image using tiling with the Tile ControlNet.
+
+     Dynamically loads the StableDiffusionControlNetImg2ImgPipeline for tiling,
+     applies LoRAs, processes the image in overlapping tiles, blends the results,
+     and unloads the pipeline to free VRAM.
+
+     Args:
+         low_res_image (PIL.Image.Image): The low-resolution image from the previous stage.
+         seed (int): The random seed (should ideally match the low-res stage seed).
+         steps (int): Number of diffusion inference steps per tile.
+         guidance_scale (float): Classifier-free guidance scale for tiles.
+         strength (float): Img2Img strength for tiling (controls detail vs. original).
+         controlnet_scale (float): Conditioning scale for the Tile ControlNet.
+         upscale_factor (int): Factor by which to increase the image resolution.
+         tile_size (int): Size of the square tiles to process.
+         tile_stride (int): Step size between tiles. Overlap = tile_size - tile_stride.
+         progress (gr.Progress): Gradio progress object for UI updates.
+
+     Returns:
+         PIL.Image.Image | None: The generated high-resolution PIL Image, or None if an error occurs.
+
+     Raises:
+         gr.Error: Raises a Gradio error if tiling fails catastrophically.
+     """
+     if not are_models_loaded() or low_res_image is None:
+         error_msg = "Cannot run hi-res tiling: "
+         if not are_models_loaded(): error_msg += "Models not loaded. "
+         if low_res_image is None: error_msg += "Low-res image is missing."
+         print(f"Error: {error_msg}")
+         return None
+
+     device = get_device()
+     dtype = get_dtype()
+     controlnet_tile = get_controlnet_tile()
+     high_res_output_image = None
+     pipe_hires = None
+
+     _, _, positive_prompt_tile, negative_prompt_tile = get_prompts_for_run()
+
+     generator_tile = torch.Generator(device=device).manual_seed(int(seed))
+
+     print("\n--- Starting Hi-Res Tiling Stage ---")
+     progress(0, desc="Preparing for Tiling...")
+
+     try:
+         # --- Setup Tiling Parameters ---
+         target_width = low_res_image.width * upscale_factor
+         target_height = low_res_image.height * upscale_factor
+         if tile_size > min(target_width, target_height):
+             print(f"Warning: Tile size ({tile_size}) > target dimension ({target_width}x{target_height}). Clamping tile size.")
+             tile_size = min(target_width, target_height)
+             tile_stride = tile_size
+
+         overlap = tile_size - tile_stride
+         if overlap < 0:
+             print("Warning: Tile stride is larger than tile size. Setting stride = tile size.")
+             tile_stride = tile_size
+             overlap = 0
+
+         print(f"Target Res: {target_width}x{target_height}, Tile Size: {tile_size}, Stride: {tile_stride}, Overlap: {overlap}")
+
+         # 1. Load Pipeline
+         print(f"Loading Hi-Res Pipeline ({BASE_MODEL_ID} + Tile ControlNet)...")
+         progress(0.05, desc="Loading Hi-Res Pipeline...")
+         pipe_hires = StableDiffusionControlNetImg2ImgPipeline.from_pretrained(
+             BASE_MODEL_ID,
+             controlnet=controlnet_tile,
+             torch_dtype=dtype,
+             safety_checker=None,
+         )
+         pipe_hires.scheduler = UniPCMultistepScheduler.from_config(pipe_hires.scheduler.config)
+         pipe_hires.to(device)
+         # pipe_hires.enable_model_cpu_offload()
+         # pipe_hires.enable_xformers_memory_efficient_attention()
+         print("Hi-Res Pipeline loaded to GPU.")
+         cleanup_memory()
+
+         # 2. Load LoRAs
+         print("Loading LoRAs for Hi-Res pipe...")
+         if os.path.exists(LORA_FILES["style"]) and os.path.exists(LORA_FILES["detail"]):
+             pipe_hires.load_lora_weights(LORA_FILES["style"], adapter_name="style")
+             pipe_hires.load_lora_weights(LORA_FILES["detail"], adapter_name="detail")
+             pipe_hires.set_adapters(ACTIVE_ADAPTERS, adapter_weights=LORA_WEIGHTS_HIRES)
+             print(f"Activated LoRAs: {ACTIVE_ADAPTERS} with weights {LORA_WEIGHTS_HIRES}")
+         else:
+             print("Warning: One or both LoRA files not found. Skipping LoRA loading.")
+             raise gr.Error("Required LoRA files not found in loras/ directory.")
+
+         # --- Prepare for Tiling Loop ---
+         print(f"Creating blurry base image ({target_width}x{target_height})...")
+         progress(0.15, desc="Preparing Base Image...")
+         blurry_high_res = low_res_image.resize((target_width, target_height), Image.LANCZOS)
+
+         final_image = Image.new("RGB", (target_width, target_height))
+         blend_mask = create_blend_mask(tile_size, overlap)
+
+         num_tiles_x = (target_width + tile_stride - 1) // tile_stride
+         num_tiles_y = (target_height + tile_stride - 1) // tile_stride
+         total_tiles = num_tiles_x * num_tiles_y
+         print(f"Processing {num_tiles_x}x{num_tiles_y} = {total_tiles} tiles...")
+
+         # --- Tiling Loop ---
+         progress(0.2, desc=f"Processing Tiles (0/{total_tiles})")
+         processed_tile_count = 0
+         with tqdm(total=total_tiles, desc="Tiling Upscale") as pbar:
+             for y in range(num_tiles_y):
+                 for x in range(num_tiles_x):
+                     tile_start_time = time.time()
+                     pbar.set_description(f"Tiling Upscale (Tile {processed_tile_count+1}/{total_tiles})")
+
+                     x_start = x * tile_stride
+                     y_start = y * tile_stride
+                     x_end = min(x_start + tile_size, target_width)
+                     y_end = min(y_start + tile_size, target_height)
+                     crop_box = (x_start, y_start, x_end, y_end)
+
+                     tile_image_blurry = blurry_high_res.crop(crop_box)
+                     current_tile_width, current_tile_height = tile_image_blurry.size
+
+                     if current_tile_width < tile_size or current_tile_height < tile_size:
+                         try: edge_color = tile_image_blurry.getpixel((0, 0))
+                         except IndexError: edge_color = (127, 127, 127)
+                         padded_tile = Image.new("RGB", (tile_size, tile_size), edge_color)
+                         padded_tile.paste(tile_image_blurry, (0, 0))
+                         tile_image_blurry = padded_tile
+                         print(f"Padded edge tile at ({x},{y})")
+
+                     # 3. Run Inference on the Tile
+                     with torch.inference_mode():
+                         output_tile = pipe_hires(
+                             prompt=positive_prompt_tile,
+                             negative_prompt=negative_prompt_tile,
+                             image=tile_image_blurry,
+                             control_image=tile_image_blurry,
+                             num_inference_steps=int(steps),
+                             strength=strength,
+                             guidance_scale=guidance_scale,
+                             controlnet_conditioning_scale=float(controlnet_scale),
+                             generator=generator_tile,
+                             output_type="pil"
+                         ).images[0]
+
+                     # --- Stitch Tile Back ---
+                     paste_x = x_start
+                     paste_y = y_start
+                     crop_w = x_end - x_start
+                     crop_h = y_end - y_start
+
+                     output_tile_region = output_tile.crop((0, 0, crop_w, crop_h))
+
+                     if overlap > 0:
+                         blend_mask_region = blend_mask.crop((0, 0, crop_w, crop_h))
+                         current_content_region = final_image.crop((paste_x, paste_y, paste_x + crop_w, paste_y + crop_h))
+                         blended_tile_region = Image.composite(output_tile_region, current_content_region, blend_mask_region)
+                         final_image.paste(blended_tile_region, (paste_x, paste_y))
+                     else:
+                         final_image.paste(output_tile_region, (paste_x, paste_y))
+
+                     processed_tile_count += 1
+                     pbar.update(1)
+
+                     # Update Gradio progress
+                     gradio_progress = 0.2 + 0.75 * (processed_tile_count / total_tiles)
+                     progress(gradio_progress, desc=f"Processing Tile {processed_tile_count}/{total_tiles}")
+
+                     tile_end_time = time.time()
+                     print(f"Tile ({x},{y}) processed in {tile_end_time - tile_start_time:.2f}s")
+                     # cleanup_memory()
+
+         print("Tile processing complete.")
+         high_res_output_image = final_image
+         progress(0.95, desc="Tiling Complete")
+
+     except Exception as e:
+         print(f"ERROR during Hi-Res Tiling Pipeline: {e}")
+         import traceback
+         traceback.print_exc()
+         high_res_output_image = None
+         raise gr.Error(f"Failed during hi-res tiling: {e}")
+     finally:
+         # 4. Cleanup Pipeline
+         print("Cleaning up Hi-Res pipeline...")
+         if pipe_hires is not None:
+             try:
+                 if hasattr(pipe_hires, 'get_active_adapters') and pipe_hires.get_active_adapters():
+                     print("Unloading LoRAs...")
+                     pipe_hires.unload_lora_weights()
+             except Exception as unload_e:
+                 print(f"Note: Error unloading LoRAs: {unload_e}")
+
+             print("Moving Hi-Res pipe components to CPU before deleting...")
+             try: pipe_hires.to('cpu')
+             except Exception as cpu_e: print(f"Note: Error moving pipe to CPU: {cpu_e}")
+
+             print("Deleting Hi-Res pipeline object...")
+             del pipe_hires
+             pipe_hires = None
+
+         print("Running garbage collection and emptying CUDA cache after Hi-Res...")
+         cleanup_memory()
+
+     print("--- Hi-Res Tiling Stage Finished ---")
+     return high_res_output_image
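A quick sanity check on the tiling arithmetic with the defaults `app.py` passes (`upscale_factor=2`, `tile_size=tile_stride=1024` on a 512x512 low-res image): the loop degenerates to a single tile with zero overlap, so the blend mask is bypassed and blending only engages when `tile_stride` is set below `tile_size`:

```python
# Worked check of the tile-count/overlap math used in run_hires_tiling.
low_res_side, upscale_factor = 512, 2
tile_size = tile_stride = 1024

target = low_res_side * upscale_factor                 # 1024
num_tiles = (target + tile_stride - 1) // tile_stride  # ceil(1024/1024) = 1
overlap = tile_size - tile_stride                      # 0 -> plain paste, no blend
print(num_tiles, overlap)                              # 1 0
```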
prompts.py ADDED
@@ -0,0 +1,53 @@
+ """
+ Defines fixed prompts and provides a function to generate
+ randomized prompts for each run, mirroring the original notebook behavior.
+ Used by the main pipeline functions.
+ """
+
+ import random
+
+ BASE_PROMPT = "detailed face portrait, accurate facial features, natural features, clear eyes, keep the gender same as the input image"
+
+ STYLE_PROMPT = r"((LIMITED PALETTE)), ((RETRO COMIC)), ((1940S \(STYLE\))), ((WESTERN COMICS \(STYLE\))), ((NIGHT COMIC)), detailed illustration, sharp lines, sfw"
+
+ BASE_NEGATIVE_PROMPT = (
+     "generic face, distorted features, unrealistic face, bad anatomy, extra limbs, fused fingers, poorly drawn hands, poorly drawn face, "
+     "text, signature, watermark, letters, words, username, artist name, speech bubble, multiple panels, "
+     "ugly, disfigured, deformed, low quality, worst quality, blurry, jpeg artifacts, noisy, "
+     "weapon, gun, knife, violence, gore, blood, injury, mutilated, horrific, nsfw, nude, naked, explicit, sexual, lingerie, bikini, suggestive, provocative, disturbing, scary, offensive, illegal, unlawful"
+ )
+
+ # --- Background Generation Elements ---
+ BG_SETTINGS = [
+     "on a futuristic city street at night", "in a retro sci-fi control room", "in a dusty western saloon",
+     "in front of an abstract energy field", "in a neon-lit alleyway", "in a stark cyberpunk cityscape",
+     "with speed lines background", "in a manga panel frame", "in a dimly lit laboratory",
+     "against a dramatic explosive background", "in a cluttered artist studio", "in a dynamic action scene"
+ ]
+
+ BG_DETAILS = [
+     "detailed background", "cinematic lighting", "dramatic shadows",
+     "high contrast", "low angle shot", "dynamic composition", "atmospheric perspective", "intricate details"
+ ]
+
+ def get_prompts_for_run():
+     """
+     Generates the prompts needed for one generation cycle,
+     including a newly randomized background for the low-res stage.
+     Returns prompts suitable for the low-res and hi-res stages.
+     """
+     # --- Low-Res Prompt Generation ---
+     chosen_bg_setting = random.choice(BG_SETTINGS)
+     chosen_bg_detail = random.choice(BG_DETAILS)
+     background_prompt = f"{chosen_bg_setting}, {chosen_bg_detail}"
+     positive_prompt_lowres = f"{BASE_PROMPT}, {STYLE_PROMPT}, {background_prompt}"
+
+     # --- Tile Prompt Generation ---
+     positive_prompt_tile = f"{BASE_PROMPT}, {STYLE_PROMPT}"
+
+     negative_prompt_tile = (
+         BASE_NEGATIVE_PROMPT +
+         ", blurry face, distorted face, mangled face, bad face, low quality, blurry"
+     )
+
+     return positive_prompt_lowres, BASE_NEGATIVE_PROMPT, positive_prompt_tile, negative_prompt_tile
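For reference, the four-tuple contract that `pipelines.py` relies on (the low-res stage unpacks the first two values, the tile stage the last two); an illustrative snippet, with output varying per call because of `random.choice`:

```python
# Illustrative only: inspecting the prompt tuple this module produces.
from prompts import get_prompts_for_run

pos_lowres, neg_base, pos_tile, neg_tile = get_prompts_for_run()
print(pos_lowres)  # base + style + randomized "<setting>, <detail>" suffix
print(pos_tile)    # base + style only; no random background at the tile stage
```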
requirements.txt ADDED
@@ -0,0 +1,22 @@
+ # Base ML
+ torch==2.7.0
+ torchvision==0.22.0
+ torchaudio==2.7.0
+ accelerate==1.6.0
+
+ # Diffusers & Transformers
+ diffusers==0.33.1
+ transformers==4.51.3
+ peft==0.15.2
+
+ # ControlNet & Auxiliaries
+ controlnet_aux==0.0.9
+ mediapipe
+ matplotlib
+ opencv-python-headless==4.11.0.86
+ Pillow==11.2.1
+ pillow-heif==0.22.0
+ numpy==2.2.5
+
+ # Web UI
+ gradio==5.29.0
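Note: plain `pip install -r requirements.txt` may resolve a CPU-only `torch==2.7.0` wheel on some platforms; for a local CUDA GPU you would typically install from a CUDA-specific index instead, e.g. `pip install torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cu126` (an assumption to verify against pytorch.org for your CUDA version), as the README's local-run section also cautions.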