HannibalY PRO

ethix

Recent Activity

liked a model 3 days ago

onnx-community/LFM2-VL-450M-ONNX

liked a Space 4 months ago

huggingface/ai-deadlines

liked a Space 5 months ago

LPX55/Qwen-Image-Edit_Fast-Presets

liked a model 3 days ago

onnx-community/LFM2-VL-450M-ONNX

Image-Text-to-Text • Updated 3 days ago • 34 • 1

liked a Space 4 months ago

AI Deadlines

⚡

587

Organize project deadlines with ease

liked a Space 5 months ago

Qwen Image Edit Fast 4-Step (Multi+Presets)

⚡

4-Step Batch Image Editing with Qwen's Edit Model

liked a dataset 5 months ago

Junhoee/Logo-Dataset-Korean

Viewer • Updated Aug 16, 2024 • 21.2k • 63 • 3

reacted to sayakpaul's post with 🤗 5 months ago

Post

2037

Fast LoRA inference for Flux with Diffusers and PEFT 🚨

There are great materials that demonstrate how to optimize inference for popular image generation models, such as Flux. However, very few cover how to serve LoRAs fast, despite LoRAs being an inseparable part of their adoption.

In our latest post, @BenjaminB and I show different techniques to optimize LoRA inference for the Flux family of models for image generation. Our recipe includes the use of:

1. torch.compile
2. Flash Attention 3 (when compatible)
3. Dynamic FP8 weight quantization (when compatible)
4. Hotswapping to avoid recompilation when swapping in new LoRAs 🤯
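
To illustrate item 4: a compiled model is generally specialized to the tensor shapes it saw, so one way to swap adapters without triggering recompilation is to zero-pad every LoRA to a common maximum rank. The sketch below is plain Python, not the Diffusers or PEFT API; `matmul`, `pad_lora`, and `target_rank` are hypothetical names used only for illustration, and the padding mechanism itself is an assumption about how shape stability can be achieved. The linked blog post has the actual code.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def pad_lora(lora_a, lora_b, target_rank):
    """Zero-pad a LoRA pair to target_rank (hypothetical helper).

    lora_a has shape (rank, in_features); lora_b has shape (out_features, rank).
    The padded rows/columns are zero, so the weight update lora_b @ lora_a
    is unchanged while the stored shapes become identical across adapters.
    """
    rank, in_features = len(lora_a), len(lora_a[0])
    if rank > target_rank:
        raise ValueError("adapter rank exceeds target_rank")
    extra = target_rank - rank
    padded_a = [row[:] for row in lora_a] + [[0.0] * in_features for _ in range(extra)]
    padded_b = [row[:] + [0.0] * extra for row in lora_b]
    return padded_a, padded_b


# Two toy adapters with different ranks (1 and 2) for a 3 -> 2 linear layer.
small_a, small_b = [[1.0, 2.0, 3.0]], [[1.0], [2.0]]
large_a, large_b = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], [[1.0, 1.0], [0.0, 2.0]]

# After padding, both adapters share the same shapes, so swapping one for
# the other only replaces values in place.
pa1, pb1 = pad_lora(small_a, small_b, target_rank=4)
pa2, pb2 = pad_lora(large_a, large_b, target_rank=4)
```

The trade-off in this sketch is a little wasted compute on the zero-padded rank in exchange for fixed shapes.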

We have tested our recipe with Flux.1-Dev on both an H100 and an RTX 4090, and we achieve at least a *2x speedup* on each GPU. We believe our recipe is grounded in the reality of how LoRA-based use cases are generally served. So, we hope this will be beneficial to the community 🤗

Even though our recipe was tested primarily with NVIDIA GPUs, it should also work with AMD GPUs.

Learn the details and the full code here:
https://huggingface.co/blog/lora-fast

3 replies

liked a Space 6 months ago

Self Forcing Wan 2.1

🎥

322

Real-time video generation

liked a Space 7 months ago

gradio_gradiodesigner

🚀

gradio designer

updated a Space 7 months ago

MCP Toolkit - Re-Thinking Deepfake Detection & Forensics

🚑

Analyze images to detect AI-generated content

liked 2 Spaces 7 months ago

RollingDepth

🛹

Video Depth without Video Models

Expressive TTS Arena

🎤

Vote for the most expressive TTS voice

reacted to yjernite's post with 🔥 7 months ago

Post

3428

Today in Privacy & AI Tooling - introducing a nifty new tool to examine where data goes in open-source apps on 🤗

HF Spaces have tons (100Ks!) of cool demos leveraging or examining AI systems - and because most of them are OSS we can see exactly how they handle user data 📚🔍

That requires actually reading the code though, which isn't always easy or quick! Good news: code LMs have gotten pretty good at automatic review, so we can offload some of the work - here I'm using Qwen/Qwen2.5-Coder-32B-Instruct to generate reports and it works pretty OK 🙌

The app works in four stages:
1. Download all code files
2. Use the Code LM to generate a detailed report pointing to code where data is transferred/(AI-)processed (screen 1)
3. Summarize the app's main functionality and data journeys (screen 2)
4. Build a Privacy TLDR with those inputs
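
The stages above can be sketched roughly as follows. This is a minimal illustration, not the code of the actual Space: `review_space`, the prompt wording, and the `lm` callable are all hypothetical, and stage 1 (downloading) is assumed to have already produced a mapping from file path to source text.

```python
from typing import Callable, Dict


def review_space(files: Dict[str, str], lm: Callable[[str], str]) -> Dict[str, str]:
    """Run a report -> summary -> TLDR review over a Space's code files.

    `files` maps file path to source text (the result of stage 1).
    `lm` is any function that takes a prompt and returns generated text.
    """
    # Concatenate the files with a header per file so the model can cite paths.
    code = "\n\n".join(f"# file: {path}\n{text}" for path, text in sorted(files.items()))

    # Stage 2: detailed report pointing to code where data is transferred or processed.
    report = lm("Point to the code where user data is transferred or processed:\n" + code)

    # Stage 3: summary of the app's main functionality and data journeys.
    summary = lm("Summarize the app's main functionality and data journeys:\n" + code)

    # Stage 4: privacy TLDR built from the two previous outputs.
    tldr = lm(
        "Write a short privacy TLDR from these inputs.\n\n"
        f"REPORT:\n{report}\n\nSUMMARY:\n{summary}"
    )
    return {"report": report, "summary": summary, "tldr": tldr}
```

Passing the model in as a plain callable keeps the sketch independent of any particular inference client, so a stub function can stand in for the code LM while testing.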

It comes with a bunch of pre-reviewed apps/Spaces, great to see how many process data locally or through (private) HF endpoints 🤗

Note that this is a POC, lots of exciting work to do to make it more robust, so:
- try it: yjernite/space-privacy
- reach out to collab: yjernite/space-privacy