PS: note that you also probably need to push a dummy / empty commit in order to trigger a `spaces` package update.
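For reference, a minimal sketch of such a dummy commit via `huggingface_hub` (the repo id is a placeholder; whether the Hub accepts a commit with no file operations is an assumption, and `git commit --allow-empty` plus a push on the Space repo achieves the same thing):

```python
from huggingface_hub import HfApi

api = HfApi()  # assumes you are logged in (`huggingface-cli login`)

# Push an empty commit to the Space to trigger a rebuild.
# Assumption: the Hub accepts a commit with no file operations;
# otherwise `git commit --allow-empty && git push` on the repo works too.
api.create_commit(
    repo_id="your-username/your-space",  # placeholder Space id
    repo_type="space",
    operations=[],
    commit_message="Empty commit to trigger a spaces package update",
)
```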


`medium` size is now available as a power-user feature.
Nothing too fancy for now—ZeroGPU Spaces still default to `large` (70GB VRAM)—but this paves the way for:
- 💰 size-based quotas / pricing (`medium` will offer significantly more usage than `large`)
- 🦣 the upcoming `xlarge` size (141GB VRAM)
You can as of now control GPU size via a Space variable. Accepted values:
- `auto` (future default)
- `medium`
- `large` (current default)
The `auto` mode checks total CUDA tensor size during startup:
- More than 30GB → `large`
- Otherwise → `medium`
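If you prefer to set the variable programmatically rather than through the Space settings UI, here's a minimal sketch with `huggingface_hub`. The variable key `ZEROGPU_SIZE` and the repo id are assumptions for illustration; check the ZeroGPU docs for the exact key name:

```python
from huggingface_hub import HfApi

api = HfApi()

# ASSUMED variable key, shown for illustration only -- check the ZeroGPU
# docs for the exact name. Accepted values per the post above:
# "auto", "medium", "large".
api.add_space_variable(
    repo_id="your-username/your-space",  # placeholder Space id
    key="ZEROGPU_SIZE",
    value="medium",
)
```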
Hi everyone, thanks for your patience.
We’ve just increased the quotas for all the long-time Pro users who have been impacted by the recent change (including Pro users who engaged with this post). You can already check your new quotas.
Hi @Keltezaa. We had to change how ZeroGPU quotas work, and we've noted that some Pro users have been negatively impacted by this change. We'll keep you updated in the next couple of days. Thank you for your patience.
Hi @sebblers , this should not be the case.
Could you try running an image generation on this Space:
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
(make sure you're logged in to your Hugging Face account)
Sorry for the inconvenience
Hi @Keltezaa,
> By my rough calculation the current recovery rate for GPU time spent is 18 min for every 60 s of GPU usage
You are not very far from reality. Actually, you get back half of your consumed quota every 5 hours: if you completely use your 25 minutes of quota, you'll get 12.5 minutes back after 5 hours (not all at once, but progressively, following an exponential-decay curve). This works out to a bit more than 60 s every 18 min when your quota is empty. In the end, if used at its maximum, that's up to 30 hours of GPU per month.
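To make the arithmetic concrete, here's a small sketch of the recovery rule described above (assuming the "half back every 5 hours" halving compounds continuously; the function name and constants are illustrative):

```python
HALF_LIFE_HOURS = 5.0   # consumed quota is halved back every 5 hours
QUOTA_MINUTES = 25.0    # Pro quota from the example above

def recovered(used_minutes: float, hours_elapsed: float) -> float:
    """Minutes of quota recovered after `hours_elapsed`, assuming the
    remaining deficit halves every HALF_LIFE_HOURS."""
    return used_minutes * (1 - 0.5 ** (hours_elapsed / HALF_LIFE_HOURS))

print(recovered(QUOTA_MINUTES, 5.0))      # 12.5 minutes after 5 hours
print(recovered(QUOTA_MINUTES, 18 / 60))  # ~1.02 minutes (~61 s) per 18 min
```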
> The second thing that bothers me a bit is that some errors or failed image generations do not refund the usage. So if an image fails due to whatever error, it still gets added to the usage and, as mentioned before, recovers very slowly.
I understand your concern, but ZeroGPU does not guarantee execution results (as opposed to inference API products).
It is rather a cloud runtime for running arbitrary, user-defined (CUDA) applications.
Subscribing to Pro allows you to create such apps, as well as to use them (yours or others') far more than free users can.
Hoping this helps clarify things.
Let me know if you have further concerns.

We've rolled out a major update to ZeroGPU! All the Spaces are now running on it.
Major improvements:
1. GPU cold starts about twice as fast!
2. RAM usage reduced by two-thirds, allowing more effective resource usage, meaning more GPUs for the community!
3. ZeroGPU initializations (cold starts) can now be tracked and displayed (use `progress=gr.Progress(track_tqdm=True)`; example below)
4. Improved compatibility and PyTorch integration, increasing the number of ZeroGPU-compatible Spaces without requiring any modifications!
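For anyone who wants to try the tracked cold starts from point 3, here's a minimal sketch of a ZeroGPU Space (the model id is just an example, and the pipeline settings are illustrative):

```python
import gradio as gr
import spaces  # ZeroGPU helper package, preinstalled on ZeroGPU Spaces
import torch
from diffusers import DiffusionPipeline

# Example model; any diffusers pipeline works the same way
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/sdxl-turbo", torch_dtype=torch.float16
)
pipe.to("cuda")  # ZeroGPU intercepts this; the GPU is attached lazily

@spaces.GPU  # a GPU is held only while this function runs
def generate(prompt: str, progress=gr.Progress(track_tqdm=True)):
    # track_tqdm surfaces the pipeline's tqdm bars -- and now the
    # ZeroGPU cold start -- as progress in the Gradio UI
    return pipe(prompt, num_inference_steps=2, guidance_scale=0.0).images[0]

gr.Interface(generate, gr.Textbox(label="Prompt"), gr.Image()).launch()
```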
Feel free to ask in this post if you have any questions
🤗 Best regards,
Charles

Image models and LoRAs now have little previews 🤏
If you don't know where to start, I invite you to browse cool LoRAs in the profiles of some amazing fine-tuners: @artificialguybr, @alvdansen, @DoctorDiffusion, @e-n-v-y, @KappaNeuro, @ostris

👉 Model: fluently/Fluently-XL-v4
✨ Playground: fluently/Fluently-Playground

New release of the `huggingface_hub` Python library! Exciting updates include:
✨ Chat-completion API in the InferenceClient!
🤖 Official inference types in InferenceClient!
🧩 Better config and tags in `ModelHubMixin`!
🏆 Generate model cards for your `ModelHubMixin` integrations!
🏎️ x3 download speed in `HfFileSystem`!!
Check out the full release notes for more details: Wauplin/huggingface_hub#5 👀
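For example, the new chat-completion API looks roughly like this (the model id is just an example):

```python
from huggingface_hub import InferenceClient

client = InferenceClient("mistralai/Mistral-7B-Instruct-v0.2")  # example model

response = client.chat_completion(
    messages=[{"role": "user", "content": "What is the huggingface_hub library?"}],
    max_tokens=100,
)
print(response.choices[0].message.content)
```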
You are now accepted into the beta, @nxphi47 and @CharlieLiveh. Enjoy!

Well, yes, if the models are somewhat compatible. Here is an experiment I did: I wanted to merge two of the best-performing models, mlabonne/NeuralBeagle14-7B and jeonsworld/CarbonVillain-en-10.7B-v4.
Here is my recipe:
1. Expand the layers of NeuralBeagle to 10.7B à la frankenmerge (sketched below).
2. DPO-tune the previous model with a high-quality preference dataset, argilla/distilabel-intel-orca-dpo-pairs.
3. Merge the previous model with CarbonVillain (needs `--allow-crimes` in mergekit! 🔪)
And here is the resulting model, CarbonBeagle-11B, which ranked top in the leaderboard for its size class:
vicgalle/CarbonBeagle-11B
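For step 1, a mergekit passthrough config for the frankenmerge expansion might look roughly like this (the layer ranges are assumptions for illustration, not the exact CarbonBeagle recipe):

```yaml
# Illustrative depth-upscaling (frankenmerge) config -- layer ranges
# are assumptions, not the exact recipe used for CarbonBeagle
slices:
  - sources:
      - model: mlabonne/NeuralBeagle14-7B
        layer_range: [0, 24]
  - sources:
      - model: mlabonne/NeuralBeagle14-7B
        layer_range: [8, 32]
merge_method: passthrough
dtype: float16
```

Run with `mergekit-yaml config.yml ./merged` (and add `--allow-crimes` for the cross-family merge in step 3).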
Hi @koisose, no special requirement; we simply do a quick check of your HF profile and then accept you (I will make sure we don't miss yours). Don't hesitate to read the https://huggingface.co/zero-gpu-explorers organization card, which currently acts as the (only) documentation for ZeroGPU.


From Argilla, we recently fine-tuned Mixtral 8x7B Instruct from Mistral AI using DPO and a binarized, curated version of UltraFeedback, and found that it outperforms every other MoE-based model on the Hub (a minimal training sketch follows the links below).
- argilla/notux-8x7b-v1
- argilla/ultrafeedback-binarized-preferences-cleaned
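To reproduce the broad strokes, here's a heavily simplified TRL DPO sketch. It is not the actual notux training script: the real run needed multi-GPU training and careful hyperparameters, and exact DPOTrainer arguments vary across trl versions.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Simplified sketch; Mixtral-8x7B needs multi-GPU + PEFT in practice
model_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# The curated preference dataset mentioned above
dataset = load_dataset(
    "argilla/ultrafeedback-binarized-preferences-cleaned", split="train"
)

trainer = DPOTrainer(
    model=model,
    args=DPOConfig(output_dir="notux-dpo-sketch", beta=0.1),
    train_dataset=dataset,
    processing_class=tokenizer,  # named `tokenizer=` in older trl versions
)
trainer.train()
```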