Sayak Paul

sayakpaul

https://sayak.dev

AI & ML interests

Diffusion models, representation learning

Recent Activity

updated a dataset about 2 hours ago

huggingface/diffusers-metadata

liked a model about 3 hours ago

nunchaku-tech/nunchaku-qwen-image

updated a model 2 days ago

diffusers-internal-dev/gemini-prompt-expander

Posts 24

Post

1028

Fast LoRA inference for Flux with Diffusers and PEFT 🚨

There is plenty of great material on optimizing inference for popular image generation models such as Flux. However, very little of it covers how to serve LoRAs fast, even though LoRAs are an inseparable part of how these models are adopted.

In our latest post, @BenjaminB and I show different techniques to optimize LoRA inference for the Flux family of models for image generation. Our recipe includes the use of:

1. torch.compile
2. Flash Attention 3 (when compatible)
3. Dynamic FP8 weight quantization (when compatible)
4. Hotswapping to avoid recompilation when swapping in new LoRAs 🤯
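
To make item 4 more concrete, here is a minimal conceptual sketch of why hotswapping can sidestep recompilation. It assumes that a change in tensor shapes is what would otherwise invalidate a compiled graph, so it zero-pads every LoRA to a fixed maximum rank and overwrites the existing buffers in place. Plain Python lists stand in for tensors, LoRA scaling is ignored, and `matmul`, `pad_lora`, and `LoRASlot` are hypothetical helpers for illustration, not part of Diffusers or PEFT.

```python
# Conceptual sketch only: plain Python lists stand in for tensors.
# matmul, pad_lora and LoRASlot are hypothetical names, not library APIs.

def matmul(a, b):
    """Multiply an (m x k) matrix by a (k x n) matrix."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def pad_lora(A, B, target_rank):
    """Zero-pad a rank-r LoRA pair (A: r x in, B: out x r) up to target_rank."""
    rank = len(A)
    if rank > target_rank:
        raise ValueError("LoRA rank exceeds the rank reserved for hotswapping")
    extra = target_rank - rank
    A_padded = [row[:] for row in A] + [[0.0] * len(A[0]) for _ in range(extra)]
    B_padded = [row[:] + [0.0] * extra for row in B]
    return A_padded, B_padded

class LoRASlot:
    """Fixed-shape buffers that a compiled graph could keep referring to."""

    def __init__(self, out_features, in_features, max_rank):
        self.max_rank = max_rank
        self.A = [[0.0] * in_features for _ in range(max_rank)]
        self.B = [[0.0] * max_rank for _ in range(out_features)]

    def hotswap(self, A, B):
        """Overwrite the buffers in place; their shapes never change."""
        A_padded, B_padded = pad_lora(A, B, self.max_rank)
        for dst, src in zip(self.A, A_padded):
            dst[:] = src
        for dst, src in zip(self.B, B_padded):
            dst[:] = src

    def shapes(self):
        return (len(self.B), len(self.B[0])), (len(self.A), len(self.A[0]))

    def delta(self):
        """Weight update B @ A contributed by the currently loaded LoRA."""
        return matmul(self.B, self.A)

# Two LoRAs with different ranks share one slot sized for rank 4.
slot = LoRASlot(out_features=2, in_features=3, max_rank=4)
lora_1 = ([[1.0, 0.0, 2.0]], [[1.0], [3.0]])                           # rank 1
lora_2 = ([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], [[1.0, 0.0], [0.0, 1.0]])  # rank 2

slot.hotswap(*lora_1)
shapes_before = slot.shapes()
slot.hotswap(*lora_2)
shapes_after = slot.shapes()
```

Because the padded rows and columns are all zeros, `slot.delta()` equals the unpadded product `B @ A` for whichever LoRA is loaded, while `shapes_before` and `shapes_after` stay identical. The linked blog post covers how the actual recipe is set up with Diffusers and PEFT.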

We tested our recipe with Flux.1-Dev on both an H100 and an RTX 4090, and achieved at least a *2x speedup* on both GPUs. We believe the recipe reflects how LoRA-based use cases are generally served in practice, so we hope it will be useful to the community 🤗

Although we tested our recipe primarily on NVIDIA GPUs, it should also work on AMD GPUs.

Learn the details and the full code here:
https://huggingface.co/blog/lora-fast

Articles 31

Article

45

Fast LoRA inference for Flux with Diffusers and PEFT

Collections 2

Papers 16

arxiv:2505.10046

arxiv:2504.16080

arxiv:2503.09641

arxiv:2412.03895

spaces 19

Civitai To Hub

Upload checkpoints from CivitAI to Hugging Face Hub.

Grade Images with Gemini

Uses Gemini 2.0 Flash to grade images.

Github Release Notes for Diffusers

Generate release notes for Hugging Face Diffusers

Demo Docker Gradio

Analyze images and receive labels

Convert Kerascv SD to Diffusers

Generate Custom Pokemons with Stable Diffusion

models 58

sayakpaul/qwen-gguf

20B • Updated 15 days ago • 22

sayakpaul/different-lora-from-civitai

12B • Updated 15 days ago • 92 • 1

sayakpaul/flux-diffusers-gguf

12B • Updated Jun 10 • 75 • 1

sayakpaul/mini-t2v-verse-with-t5-embeddings

sayakpaul/trained-lumina2-lora-yarn

Text-to-Image • Updated Feb 20 • 20 • 3

sayakpaul/vjepa-ckpts

sayakpaul/FLUX.1-dev-edit-v0

Text-to-Image • Updated Jan 21 • 54 • 46

sayakpaul/cartoon-control-lr_1e-4-wd_1e-4-gs_10.0-cd_0.1

Text-to-Image • Updated Jan 5 • 12 • 6

sayakpaul/q8-ltx-video

Updated Jan 2 • 9 • 7

sayakpaul/yarn_art_lora_sana

Text-to-Image • Updated Dec 16, 2024 • 7 • 1

datasets 22

sayakpaul/sample-datasets

Viewer • Updated 11 days ago • 6 • 23.4k • 1

sayakpaul/butteflies_with_classes

Preview • Updated May 27 • 8.3k

sayakpaul/OmniEdit-mini

Viewer • Updated Jan 5 • 21.1k • 996 • 4

sayakpaul/video-dataset-disney-organized

Viewer • Updated Nov 29, 2024 • 69 • 353 • 5

sayakpaul/pick-a-pic-v2-unique-prompts

Viewer • Updated Nov 9, 2024 • 59k • 91 • 1

sayakpaul/poses-controlnet-dataset

Viewer • Updated Aug 29, 2024 • 496 • 45 • 6

sayakpaul/torchao-diffusers

Updated Aug 28, 2024 • 247

sayakpaul/pickapic_v2_webdataset

Viewer • Updated Apr 4, 2024 • 8.7k • 982 • 2

sayakpaul/no_robots_only_coding

Viewer • Updated Mar 20, 2024 • 350 • 11 • 1

sayakpaul/diffusers-qa-chatbot-artifacts

Viewer • Updated Mar 9, 2024 • 265k • 392 • 1
