What is Imagine?
Imagine is an all-in-one framework for creating visually stunning posters, blending:
- Precise and accurate text rendering
- Seamless integration of abstract art
- Bold, eye-catching layouts
- A cohesive and harmonious visual style
Quick Start
Installation
# Clone the repository
git clone https://github.com/skylinemusiccds/Imagine.git
cd Imagine
# Create conda environment
conda create -n imagine python=3.11
conda activate imagine
# Install dependencies
pip install -r requirements.txt
Easy Usage
Imagine is a drop-in replacement for the transformer in a standard diffusers FluxPipeline, so it fits into custom workflows and works with other tooling built on that pipeline.
Load the model as follows:
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel
# 1. Define model IDs and settings
pipeline_id = "black-forest-labs/FLUX.1-dev"
imagine_transformer_id = "Satyam-Singh/Imagine"
device = "cuda"
dtype = torch.bfloat16
# 2. Load the base pipeline
pipe = FluxPipeline.from_pretrained(pipeline_id, torch_dtype=dtype)
# 3. The key step: simply replace the original transformer with our Imagine model
pipe.transformer = FluxTransformer2DModel.from_pretrained(
    imagine_transformer_id,
    torch_dtype=dtype,
)
pipe.to(device)
# Now, `pipe` is a standard diffusers pipeline ready for inference with your own logic.
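As a minimal sketch of what "your own logic" might look like, the helper below wraps a single generation call. It assumes `pipe` is the pipeline loaded above. The helper name `generate_poster`, the example prompt, and the output filename are illustrative and not part of the Imagine repository; the default step count and guidance scale mirror the values used in the command-line examples in this README.

```python
from typing import Any, Optional


def generate_poster(
    pipe: Any,
    prompt: str,
    seed: Optional[int] = 42,
    num_inference_steps: int = 28,
    guidance_scale: float = 3.5,
) -> Any:
    """Run one text-to-image call and return the first generated image.

    Hypothetical helper: it only forwards arguments to a diffusers-style pipeline.
    """
    kwargs = {
        "prompt": prompt,
        "num_inference_steps": num_inference_steps,
        "guidance_scale": guidance_scale,
    }
    if seed is not None:
        # Import torch lazily so the helper itself has no hard dependency.
        import torch

        # A seeded generator helps make the sampled image repeatable.
        kwargs["generator"] = torch.Generator(device="cpu").manual_seed(seed)
    # The pipeline output exposes the generated images as a list in `.images`.
    return pipe(**kwargs).images[0]


# Usage (assumes `pipe` was loaded as shown above):
# image = generate_poster(pipe, "Jazz festival poster with bold brass-colored lettering")
# image.save("poster.png")
```

Passing `seed=None` skips the generator, so each call samples fresh noise.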
Quick Generation
For the best results, we recommend using the provided inference.py script, which includes our prompt rewriting feature (enabled with --enable_recap). It automatically refines your input prompt before the image is generated.
Generate Posters with Precision
Create high-quality aesthetic posters from your prompt. The script runs in BF16 precision to reduce memory use and speed up inference.
To get started, visit our GitHub repository (https://github.com/skylinemusiccds/Imagine) and run:
python inference.py \
--prompt "Urban Canvas Street Art Expo poster with bold graffiti lettering and vibrant, dynamic color splashes capturing the energy of street art." \
--enable_recap \
--num_inference_steps 28 \
--guidance_scale 3.5 \
--seed 42 \
--pipeline_path "black-forest-labs/FLUX.1-dev" \
--custom_transformer_path "Satyam-Singh/Imagine" \
--qwen_model_path "Qwen/Qwen3-8B"
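To try several seeds or prompts, the command above can be assembled programmatically. The sketch below only builds the argument list from the flags shown in this README; `build_inference_command` is a hypothetical helper, not something shipped with the repository, and the model paths are the ones used above.

```python
from typing import List


def build_inference_command(
    prompt: str,
    seed: int = 42,
    enable_recap: bool = True,
    script: str = "inference.py",
) -> List[str]:
    """Return the argv list for one run of the inference script."""
    cmd = ["python", script, "--prompt", prompt]
    if enable_recap:
        # Flag taken from the README command; it takes no value.
        cmd.append("--enable_recap")
    cmd += [
        "--num_inference_steps", "28",
        "--guidance_scale", "3.5",
        "--seed", str(seed),
        "--pipeline_path", "black-forest-labs/FLUX.1-dev",
        "--custom_transformer_path", "Satyam-Singh/Imagine",
        "--qwen_model_path", "Qwen/Qwen3-8B",
    ]
    return cmd


# Usage: run the same prompt with three different seeds.
# import subprocess
# for seed in (1, 2, 3):
#     subprocess.run(build_inference_command("Retro sci-fi film poster", seed=seed), check=True)
```

Passing the arguments as a list avoids shell quoting problems with long prompts. Setting `script="inference_offload.py"` builds the equivalent command for the low-memory script.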
If your GPU has limited memory, use inference_offload.py instead, which offloads some components to the CPU:
python inference_offload.py \
--prompt "Urban Canvas Street Art Expo poster with bold graffiti lettering and vibrant, dynamic color splashes capturing the energy of street art." \
--enable_recap \
--num_inference_steps 28 \
--guidance_scale 3.5 \
--seed 42 \
--pipeline_path "black-forest-labs/FLUX.1-dev" \
--custom_transformer_path "Satyam-Singh/Imagine" \
--qwen_model_path "Qwen/Qwen3-8B"
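How inference_offload.py performs its offloading is not described here. For readers driving the pipeline from Python directly, diffusers provides two built-in offloading methods, and the sketch below chooses between them. The function name `apply_cpu_offload` and its `mode` values are illustrative assumptions; `enable_model_cpu_offload()` and `enable_sequential_cpu_offload()` are existing diffusers pipeline methods and require the accelerate package.

```python
from typing import Any


def apply_cpu_offload(pipe: Any, mode: str = "model") -> Any:
    """Enable one of the diffusers CPU offloading strategies on `pipe`.

    Hypothetical helper; not necessarily what inference_offload.py does.
    """
    if mode == "model":
        # Moves whole sub-models to the GPU only while they are running.
        pipe.enable_model_cpu_offload()
    elif mode == "sequential":
        # Offloads at a finer granularity: lowest GPU memory use, slowest inference.
        pipe.enable_sequential_cpu_offload()
    else:
        raise ValueError(f"unknown offload mode: {mode!r}")
    return pipe


# Usage: call this instead of `pipe.to("cuda")`, since offloading manages device placement.
# pipe = apply_cpu_offload(pipe, mode="model")
```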
Gradio Web UI
We provide a Gradio web UI for Imagine; see our GitHub repository for details. To launch it, run:
python demo_gradio.py
Performance Benchmarks
Quantitative Results
Method | Text Recall ↑ | Text F-score ↑ | Text Accuracy ↑ |
---|---|---|---|
OpenCOLE (Open) | 0.082 | 0.076 | 0.061 |
Playground-v2.5 (Open) | 0.157 | 0.146 | 0.132 |
SD3.5 (Open) | 0.565 | 0.542 | 0.497 |
Flux1.dev (Open) | 0.723 | 0.707 | 0.667 |
Ideogram-v2 (Closed) | 0.711 | 0.685 | 0.680 |
BAGEL (Open) | 0.543 | 0.536 | 0.463 |
Gemini2.0-Flash-Gen (Closed) | 0.798 | 0.786 | 0.746 |
Imagine (ours) | 0.787 | 0.774 | 0.735 |
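For a quick reading of the table, the short script below ranks the methods by Text F-score. The values are copied from the table above; the variable names are illustrative.

```python
# Text F-score per method, copied from the results table.
f_scores = {
    "OpenCOLE": 0.076,
    "Playground-v2.5": 0.146,
    "SD3.5": 0.542,
    "Flux1.dev": 0.707,
    "Ideogram-v2": 0.685,
    "BAGEL": 0.536,
    "Gemini2.0-Flash-Gen": 0.786,
    "Imagine": 0.774,
}

# Sort from highest to lowest score (higher is better).
ranked = sorted(f_scores.items(), key=lambda item: item[1], reverse=True)

for rank, (method, score) in enumerate(ranked, start=1):
    print(f"{rank}. {method}: {score:.3f}")
```

On this metric Imagine ranks second overall, behind Gemini2.0-Flash-Gen and ahead of its base model Flux1.dev.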
Citation
If you find Imagine useful for your research, please cite our paper:
@article{llava_imagine_2025,
  title={LLaVA Imagine: Words to Visuals},
  author={Satyam Singh and {UniVerse Ai}},
  year={2025}
}