
Charles Bensimon

cbensimon

AI & ML interests: None yet

Organizations

Hugging Face, Robustness Gym, Spaces-explorers, The Team Ten, Blog-explorers, TTS Eval (OLD), ZeroGPU Explorers, TTS AGI, Social Post Explorers, zero gpu hacking

cbensimon's activity

replied to their post 19 days ago

PS: note that you also probably need to push a dummy/empty commit in order to trigger a spaces package update.
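
For reference, a minimal way to do that from a local clone of the Space (a sketch; the commit message is arbitrary):

```python
# Sketch: push an empty commit to force the Space to rebuild.
# Run from the root of a local clone of the Space repository.
import subprocess

subprocess.run(["git", "commit", "--allow-empty", "-m", "Trigger Space rebuild"], check=True)
subprocess.run(["git", "push"], check=True)
```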

posted an update 19 days ago
🚀 ZeroGPU medium size is now available as a power-user feature

Nothing too fancy for now—ZeroGPU Spaces still default to large (70GB VRAM)—but this paves the way for:
- 💰 size-based quotas / pricing (medium will offer significantly more usage than large)
- 🦣 the upcoming xlarge size (141GB VRAM)

You can now control the GPU size via a Space variable. Accepted values:
- auto (future default)
- medium
- large (current default)

The auto mode checks total CUDA tensor size during startup:
- More than 30GB → large
- Otherwise → medium
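
For illustration, here is a hypothetical sketch of the kind of check the auto mode performs. The 30GB threshold comes from the post; the scanning approach and the function name are my assumptions, not the actual ZeroGPU implementation:

```python
import gc

import torch

def pick_gpu_size(threshold_bytes: int = 30 * 1024**3) -> str:
    """Hypothetical: 'large' if live CUDA tensors exceed 30GB, else 'medium'."""
    total = 0
    for obj in gc.get_objects():  # scan live Python objects for CUDA tensors
        if torch.is_tensor(obj) and obj.is_cuda:
            total += obj.numel() * obj.element_size()
    return "large" if total > threshold_bytes else "medium"
```
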
replied to Keltezaa's post about 2 months ago

Hi everyone, thanks for your patience.
We've just increased the quotas for all the long-time Pro users who have been impacted by the recent change (including Pro users who engaged with this post). You can look at your new quotas.

replied to Keltezaa's post 2 months ago

Hi @Keltezaa. We had to change how ZeroGPU quotas work, and we have noted that some Pro users have been negatively impacted by this change. We'll keep you updated in the next couple of days. Thank you for your patience.

replied to sebblers's post 4 months ago
replied to Keltezaa's post 4 months ago

Hi @Keltezaa,

> By my rough calculation, the current recovery rate for GPU time spent is 18min per every 60sec of GPU usage

You are not very far from reality. Actually, you get back half of your consumed quota every 5 hours, which means that if you completely use your 25 minutes of quota, you'll get 12.5 minutes back after 5 hours (not all at once but progressively, following a decaying curve). That works out to a bit more than 60s every 18min when your quota is empty. In the end, if used at its maximum, this adds up to as much as 30 hours of GPU time per month.
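
To make the arithmetic concrete, here is a rough model of that recovery, assuming the consumed quota simply halves every 5 hours (my reading of the description above; the exact server-side schedule may differ):

```python
HALF_LIFE_HOURS = 5.0   # "half of your consumed quota back every 5 hours"
PRO_QUOTA_MIN = 25.0    # Pro quota from this thread, in minutes

def consumed_after(consumed_min: float, hours: float) -> float:
    """Consumed quota remaining after `hours`, halving every 5 hours."""
    return consumed_min * 0.5 ** (hours / HALF_LIFE_HOURS)

# Starting from a fully used quota, how much comes back in 18 minutes?
recovered_min = PRO_QUOTA_MIN - consumed_after(PRO_QUOTA_MIN, hours=18 / 60)
print(f"~{recovered_min * 60:.0f}s recovered per 18min")  # ~61s, matching the estimate
```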

> The second thing that bothers me a bit is that some errors or failed image generations do not refund the usage. So if an image fails due to whatever error, it still gets added to the usage and, as mentioned before, recovers very slowly.

I understand your concern, but ZeroGPU does not guarantee execution results (as opposed to Inference API products).
It is rather a cloud runtime that supports running arbitrary, user-defined (CUDA) applications.
Subscribing to Pro allows you to create such apps, as well as to use them (yours or others') far more than free users can.

Hoping that this helps clarify things.
Let me know if you have further concerns.

posted an update 9 months ago
Hello everybody,

We've rolled out a major update to ZeroGPU! All the Spaces are now running on it.

Major improvements:

1. GPU cold starts are about twice as fast!
2. RAM usage reduced by two-thirds, allowing more effective resource usage and therefore more GPUs for the community!
3. ZeroGPU initializations (cold starts) can now be tracked and displayed (use progress=gr.Progress(track_tqdm=True); see the sketch after this list)
4. Improved compatibility and PyTorch integration, increasing the number of ZeroGPU-compatible Spaces without requiring any modifications!
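
Regarding point 3, here is a minimal sketch of such a Space (the task function is a placeholder; spaces.GPU and track_tqdm=True are the relevant pieces):

```python
import gradio as gr
import spaces

@spaces.GPU  # run this function on a ZeroGPU-allocated GPU
def generate(prompt: str, progress=gr.Progress(track_tqdm=True)):
    # A tqdm-instrumented pipeline here would have its progress (including
    # the ZeroGPU cold start) displayed in the Gradio UI.
    return prompt

gr.Interface(fn=generate, inputs="text", outputs="text").launch()
```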

Feel free to reply to this post if you have any questions

🤗 Best regards,
Charles
reacted to multimodalart's post with ❤️ 9 months ago
reacted to ehristoforu's post with 🔥👍 12 months ago
reacted to Wauplin's post with 🔥 about 1 year ago
🚀 Just released version 0.22.0 of the huggingface_hub Python library!

Exciting updates include:
✨ Chat-completion API in the InferenceClient!
🤖 Official inference types in InferenceClient!
🧩 Better config and tags in ModelHubMixin!
🏆 Generate model cards for your ModelHubMixin integrations!
🏎️ 3x download speed in HfFileSystem!

Check out the full release notes for more details: Wauplin/huggingface_hub#5 👀
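
As a quick example, the new chat-completion API can be called like this (the model name is just an illustration):

```python
from huggingface_hub import InferenceClient

client = InferenceClient("HuggingFaceH4/zephyr-7b-beta")  # example model
response = client.chat_completion(
    messages=[{"role": "user", "content": "What is ZeroGPU?"}],
    max_tokens=100,
)
print(response.choices[0].message.content)
```
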
replied to merve's post about 1 year ago
reacted to merve's post with 🤗👍 over 1 year ago
Migrated all my GPU-consuming Spaces to ZERO; it was super easy to do (add three lines of code and voilà!) and the start-up time decreased dramatically as well 💜
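
The "three lines" are presumably along these lines (a guess at the migration, not merve's actual diff; the diffusers pipeline is only an example workload):

```python
import gradio as gr
import spaces  # addition 1 (plus `spaces` in requirements.txt as addition 2)
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("stabilityai/sdxl-turbo", torch_dtype=torch.float16)
pipe.to("cuda")

@spaces.GPU  # addition 3: allocate a GPU only while this function runs
def infer(prompt: str):
    return pipe(prompt, num_inference_steps=2, guidance_scale=0.0).images[0]

gr.Interface(fn=infer, inputs="text", outputs="image").launch()
```
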
reacted to vicgalle's post with 🤯 over 1 year ago
Can you merge models of different sizes? ⚗️

Well, yes, if the models are somewhat compatible. Here is an experiment I did: I wanted to merge two of the best-performing models, mlabonne/NeuralBeagle14-7B and jeonsworld/CarbonVillain-en-10.7B-v4.

Here is my recipe:
1. Expand the layers of NeuralBeagle to 10.7B, à la frankenmerge.
2. DPO-tune the previous model with a high-quality preference dataset, argilla/distilabel-intel-orca-dpo-pairs
3. Merge the previous model with CarbonVillain (needs --allow-crimes in mergekit! 🔪); see the sketch below.

And here is the resulting model, CarbonBeagle-11B, which ranked top in the leaderboard for its size class:
vicgalle/CarbonBeagle-11B
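
For step 3, a mergekit invocation in that spirit might look like the following (a sketch: the model names, weights, and merge method are illustrative, not vicgalle's actual recipe; only the --allow-crimes flag is taken from the post):

```python
import subprocess

# Illustrative mergekit config: average the DPO-tuned frankenmerge from
# steps 1-2 (hypothetical local path) with CarbonVillain.
config = """\
merge_method: linear
models:
  - model: ./dpo-tuned-frankenbeagle-10.7B  # hypothetical output of steps 1-2
    parameters:
      weight: 0.5
  - model: jeonsworld/CarbonVillain-en-10.7B-v4
    parameters:
      weight: 0.5
dtype: float16
"""

with open("merge.yml", "w") as f:
    f.write(config)

# --allow-crimes lets mergekit proceed despite architecture mismatches
subprocess.run(["mergekit-yaml", "merge.yml", "./merged-model", "--allow-crimes"], check=True)
```
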
replied to merve's post over 1 year ago
reacted to mrfakename's post with 👍 over 1 year ago
This is my first post! Thanks to @victor for adding me!
reacted to alvarobartt's post with ❤️ over 1 year ago