Model card
We introduce Sana, a text-to-image framework that can efficiently generate images up to 4096 × 4096 resolution. Sana synthesizes high-resolution, high-quality images with strong text-image alignment at remarkably fast speed, and it can be deployed on a laptop GPU.
Source code is available at https://github.com/NVlabs/Sana.
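Before the ControlNet workflow below, here is a minimal sketch of plain text-to-image generation with the base checkpoint that this ControlNet attaches to. It assumes diffusers' SanaPipeline and the Efficient-Large-Model/Sana_600M_1024px_diffusers repository referenced later in this card; adjust to your setup as needed.

import torch
from diffusers import SanaPipeline

# base text-to-image pipeline (the same checkpoint the ControlNet below plugs into)
pipe = SanaPipeline.from_pretrained(
    "Efficient-Large-Model/Sana_600M_1024px_diffusers",
    variant="fp16",
    torch_dtype=torch.float16,
)
pipe.to("cuda")

# keep the VAE and text encoder in bfloat16, as in the ControlNet example below
pipe.vae.to(torch.bfloat16)
pipe.text_encoder.to(torch.bfloat16)

image = pipe(prompt='a cat with a neon sign that says "Sana"').images[0]
image.save("sana_base.png")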
🧨 Diffusers
1. How to use SanaControlNetPipeline with 🧨 diffusers
# run `pip install git+https://github.com/huggingface/diffusers` before using Sana in diffusers
import torch
from diffusers import SanaControlNetModel, SanaControlNetPipeline
from diffusers.utils import load_image

# load the ControlNet weights in fp16
controlnet = SanaControlNetModel.from_pretrained(
    "ishan24/Sana_600M_1024px_ControlNet_diffusers",
    torch_dtype=torch.float16,
)

# load the base Sana pipeline and plug in the ControlNet
pipe = SanaControlNetPipeline.from_pretrained(
    "Efficient-Large-Model/Sana_600M_1024px_diffusers",
    variant="fp16",
    controlnet=controlnet,
    torch_dtype=torch.float16,
)
pipe.to("cuda")

# keep the VAE and text encoder in bfloat16
pipe.vae.to(torch.bfloat16)
pipe.text_encoder.to(torch.bfloat16)

# conditioning image (a HED edge map)
cond_image = load_image(
    "https://huggingface.co/ishan24/Sana_600M_1024px_ControlNet_diffusers/resolve/main/hed_example.png"
)

prompt = 'a cat with a neon sign that says "Sana"'
image = pipe(
    prompt,
    control_image=cond_image,
).images[0]
image.save("sana.png")
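The example above conditions on a pre-made HED edge map (hed_example.png). To condition on your own photo, you first need a similar edge map; below is a minimal sketch assuming the controlnet_aux package (pip install controlnet_aux) and its HEDdetector with the lllyasviel/Annotators weights. The input path is a placeholder for your own image.

from controlnet_aux import HEDdetector
from diffusers.utils import load_image

# assumption: HED edge detector from the controlnet_aux package
hed = HEDdetector.from_pretrained("lllyasviel/Annotators")

# placeholder path/URL: replace with your own photo
source = load_image("your_photo.png")
cond_image = hed(source)  # returns a HED edge map as a PIL image
cond_image.save("hed_control.png")

# reuse the pipeline loaded above with the derived control image
image = pipe(
    'a cat with a neon sign that says "Sana"',
    control_image=cond_image,
).images[0]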