PS: note that you also probably need to push a dummy / empty commit in order to trigger a `spaces` package update.
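For reference, a minimal sketch of such a dummy commit via `huggingface_hub` (the repo id is a placeholder; whether the Hub accepts a commit with no file operations is an assumption, and `git commit --allow-empty` plus a push on the Space repo achieves the same thing):

```python
from huggingface_hub import HfApi

api = HfApi()  # assumes you are logged in (`huggingface-cli login`)

# Push an empty commit to the Space to trigger a rebuild.
# Assumption: the Hub accepts a commit with no file operations;
# otherwise `git commit --allow-empty && git push` on the repo works too.
api.create_commit(
    repo_id="your-username/your-space",  # placeholder Space id
    repo_type="space",
    operations=[],
    commit_message="Empty commit to trigger a spaces package update",
)
```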


`medium` size is now available as a power-user feature.
Nothing too fancy for now—ZeroGPU Spaces still default to `large` (70GB VRAM)—but this paves the way for:
- 💰 size-based quotas / pricing (`medium` will offer significantly more usage than `large`)
- 🦣 the upcoming `xlarge` size (141GB VRAM)
You can as of now control GPU size via a Space variable. Accepted values:
- `auto` (future default)
- `medium`
- `large` (current default)
The `auto` mode checks total CUDA tensor size during startup:
- More than 30GB → `large`
- Otherwise → `medium`
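If you prefer to set the variable programmatically rather than through the Space settings UI, here's a minimal sketch with `huggingface_hub`. The variable key `ZEROGPU_SIZE` and the repo id are assumptions for illustration; check the ZeroGPU docs for the exact key name:

```python
from huggingface_hub import HfApi

api = HfApi()

# ASSUMED variable key, shown for illustration only -- check the ZeroGPU
# docs for the exact name. Accepted values per the post above:
# "auto", "medium", "large".
api.add_space_variable(
    repo_id="your-username/your-space",  # placeholder Space id
    key="ZEROGPU_SIZE",
    value="medium",
)
```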
Hi everyone, thanks for your patience.
We’ve just increased the quotas for all the long-time Pro users who have been impacted by the recent change (including Pro users who engaged with this post). You can already check your new quotas.
Hi @Keltezaa. We had to change how ZeroGPU quotas work, and we've noted that some Pro users have been negatively impacted by this change. We'll keep you updated in the next couple of days. Thank you for your patience.
Hi @sebblers , this should not be the case.
Could you try running an image generation on this Space:
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
(make sure you're logged in to your Hugging Face account)
Sorry for the inconvenience
Hi @Keltezaa,
> By my rough calculation the current recovery rate for GPU time spent is 18 min for every 60 s of GPU usage
You are not very far from reality. Actually, you get back half of your consumed quota every 5 hours: if you completely use your 25 minutes of quota, you'll get 12.5 minutes back after 5 hours (not all at once, but progressively, following an exponential-decay curve). This works out to a bit more than 60 s every 18 min when your quota is empty. In the end, if used at its maximum, that's up to 30 hours of GPU per month.
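To make the arithmetic concrete, here's a small sketch of the recovery rule described above (assuming the "half back every 5 hours" halving compounds continuously; the function name and constants are illustrative):

```python
HALF_LIFE_HOURS = 5.0   # consumed quota is halved back every 5 hours
QUOTA_MINUTES = 25.0    # Pro quota from the example above

def recovered(used_minutes: float, hours_elapsed: float) -> float:
    """Minutes of quota recovered after `hours_elapsed`, assuming the
    remaining deficit halves every HALF_LIFE_HOURS."""
    return used_minutes * (1 - 0.5 ** (hours_elapsed / HALF_LIFE_HOURS))

print(recovered(QUOTA_MINUTES, 5.0))      # 12.5 minutes after 5 hours
print(recovered(QUOTA_MINUTES, 18 / 60))  # ~1.02 minutes (~61 s) per 18 min
```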
> The second thing that bothers me a bit is that some errors or failed image generations do not refund the usage. So if an image fails due to whatever error, it still gets added to the usage and, as mentioned before, recovers very slowly.
I understand your concern, but ZeroGPU does not guarantee execution results (as opposed to inference API products).
It is rather a cloud runtime for running arbitrary, user-defined (CUDA) applications.
Subscribing to Pro allows you to create such apps, as well as to use them (yours or others') far more than free users can.
Hoping this helps clarify things.
Let me know if you have further concerns.

We've rolled out a major update to ZeroGPU! All the Spaces are now running on it.
Major improvements:
1. GPU cold starts about twice as fast!
2. RAM usage reduced by two-thirds, allowing more effective resource usage, meaning more GPUs for the community!
3. ZeroGPU initializations (cold starts) can now be tracked and displayed (use `progress=gr.Progress(track_tqdm=True)`; example below)
4. Improved compatibility and PyTorch integration, increasing the number of ZeroGPU-compatible Spaces without requiring any modifications!
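For anyone who wants to try the tracked cold starts from point 3, here's a minimal sketch of a ZeroGPU Space (the model id is just an example, and the pipeline settings are illustrative):

```python
import gradio as gr
import spaces  # ZeroGPU helper package, preinstalled on ZeroGPU Spaces
import torch
from diffusers import DiffusionPipeline

# Example model; any diffusers pipeline works the same way
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/sdxl-turbo", torch_dtype=torch.float16
)
pipe.to("cuda")  # ZeroGPU intercepts this; the GPU is attached lazily

@spaces.GPU  # a GPU is held only while this function runs
def generate(prompt: str, progress=gr.Progress(track_tqdm=True)):
    # track_tqdm surfaces the pipeline's tqdm bars -- and now the
    # ZeroGPU cold start -- as progress in the Gradio UI
    return pipe(prompt, num_inference_steps=2, guidance_scale=0.0).images[0]

gr.Interface(generate, gr.Textbox(label="Prompt"), gr.Image()).launch()
```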
Feel free to ask in this post if you have any questions
🤗 Best regards,
Charles

Image models and LoRAs now have little previews 🤏
If you don't know where to start, I invite you to browse cool LoRAs in the profiles of some amazing fine-tuners: @artificialguybr, @alvdansen, @DoctorDiffusion, @e-n-v-y, @KappaNeuro, @ostris

👉 Model: fluently/Fluently-XL-v4
✨ Playground: fluently/Fluently-Playground

New release of the `huggingface_hub` Python library! Exciting updates include:
✨ Chat-completion API in the InferenceClient!
🤖 Official inference types in InferenceClient!
🧩 Better config and tags in `ModelHubMixin`!
🏆 Generate model cards for your `ModelHubMixin` integrations!
🏎️ x3 download speed in `HfFileSystem`!!
Check out the full release notes for more details: Wauplin/huggingface_hub#5 👀
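For example, the new chat-completion API looks roughly like this (the model id is just an example):

```python
from huggingface_hub import InferenceClient

client = InferenceClient("mistralai/Mistral-7B-Instruct-v0.2")  # example model

response = client.chat_completion(
    messages=[{"role": "user", "content": "What is the huggingface_hub library?"}],
    max_tokens=100,
)
print(response.choices[0].message.content)
```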
You are now accepted into the beta, @nxphi47 and @CharlieLiveh. Enjoy!

Well, yes, if the models are somewhat compatible. Here is an experiment I did: I wanted to merge two of the best-performing models, mlabonne/NeuralBeagle14-7B and jeonsworld/CarbonVillain-en-10.7B-v4.
Here is my recipe:
1. Expand the layers of NeuralBeagle to 10.7B à la frankenmerge (sketched below).
2. DPO-tune the previous model with a high-quality preference dataset, argilla/distilabel-intel-orca-dpo-pairs.
3. Merge the previous model with CarbonVillain (needs `--allow-crimes` in mergekit! 🔪)
And here is the resulting model, CarbonBeagle-11B, which ranked top in the leaderboard for its size class:
vicgalle/CarbonBeagle-11B
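For step 1, a mergekit passthrough config for the frankenmerge expansion might look roughly like this (the layer ranges are assumptions for illustration, not the exact CarbonBeagle recipe):

```yaml
# Illustrative depth-upscaling (frankenmerge) config -- layer ranges
# are assumptions, not the exact recipe used for CarbonBeagle
slices:
  - sources:
      - model: mlabonne/NeuralBeagle14-7B
        layer_range: [0, 24]
  - sources:
      - model: mlabonne/NeuralBeagle14-7B
        layer_range: [8, 32]
merge_method: passthrough
dtype: float16
```

Run with `mergekit-yaml config.yml ./merged` (and add `--allow-crimes` for the cross-family merge in step 3).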
Hi @koisose, no special requirement; we simply do a quick check of your HF profile and then accept you (I will make sure we don't miss yours). Don't hesitate to read the https://huggingface.co/zero-gpu-explorers organization card, which currently acts as the (only) documentation for ZeroGPU.


From Argilla, we recently fine-tuned Mixtral 8x7B Instruct from Mistral AI using DPO and a binarized, curated version of UltraFeedback, and found that it outperforms every other MoE-based model on the Hub (a minimal training sketch follows the links below).
- argilla/notux-8x7b-v1
- argilla/ultrafeedback-binarized-preferences-cleaned
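To reproduce the broad strokes, here's a heavily simplified TRL DPO sketch. It is not the actual notux training script: the real run needed multi-GPU training and careful hyperparameters, and exact DPOTrainer arguments vary across trl versions.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Simplified sketch; Mixtral-8x7B needs multi-GPU + PEFT in practice
model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# The curated preference dataset mentioned above
dataset = load_dataset(
    "argilla/ultrafeedback-binarized-preferences-cleaned", split="train"
)

trainer = DPOTrainer(
    model=model,
    args=DPOConfig(output_dir="notux-dpo-sketch", beta=0.1),
    train_dataset=dataset,
    processing_class=tokenizer,  # named `tokenizer=` in older trl versions
)
trainer.train()
```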