Hugging Face Optimum

non-profit

https://huggingface.co/docs/optimum

Activity Feed

AI & ML interests

Accelerating DL

Recent Activity

badaoui updated a Space 11 days ago

optimum/neuron-export

echarlaix updated a dataset 15 days ago

optimum/documentation-images

badaoui new activity about 1 month ago

optimum/neuron-export:Using neuron cache instead of creating a pull-request in the target repo


pagezyhf

posted an update 4 days ago

Post

1540

In case you missed it, Hugging Face expanded its collaboration with Azure a few weeks ago with a curated catalog of 10,000 models, accessible from Azure AI Foundry and Azure ML!

@alvarobartt has been cooking these last few days to prepare the one and only documentation you need if you want to deploy Hugging Face models on Azure. It comes with an FAQ, great guides, and examples on how to deploy VLMs, LLMs, and smolagents, with more to come very soon.

We need your feedback: come help us and let us know what else you want to see, which model we should add to the collection, which model task we should prioritize adding, what else we should build a tutorial for. You’re just an issue away on our GitHub repo!

https://huggingface.co/docs/microsoft-azure/index

jeffboudier

posted an update 9 days ago

Post

352

AMD summer hackathons are here!
A chance to get hands-on with MI300X GPUs and accelerate models.
🇫🇷 Paris - Station F - July 5-6
🇮🇳 Mumbai - July 12-13
🇮🇳 Bengaluru - July 19-20

Hugging Face and GPU Mode will be on site, and on July 6 in Paris @ror will share lessons learned while building new kernels to accelerate Llama 3.1 405B on ROCm.

Register for the Paris event: https://lu.ma/fmvdjmur?tk=KeAbiP
All dates: https://lu.ma/calendar/cal-3sxhD5FdxWsMDIz

pagezyhf

posted an update 10 days ago

Post

3156

Hackathons in Paris on July 5th and 6th!

Hugging Face just wrapped 4 months of deep work with AMD to push kernel-level optimization on their MI300X GPUs. Now, it's time to share everything we learned.

Join us in Paris at STATION F for a hands-on weekend of workshops and a hackathon focused on making open-source LLMs faster and more efficient on AMD.

Prizes, amazing host speakers, ... if you want more details, navigate to https://lu.ma/fmvdjmur!

2 replies

badaoui

updated a Space 11 days ago

Export to AWS Neuron

🏎

Export HF models to AWS Neuron-optimized format for Trn/Inf

echarlaix

updated a dataset 15 days ago

optimum/documentation-images

Viewer • Updated 15 days ago • 15 • 8.13k • 2

pagezyhf

posted an update 17 days ago

Post

2388

Webinar Alert

Build your first chatbot with a Hugging Face Spaces frontend and Gaudi-powered backend with @bconsolvo ! He will teach you how to build an LLM-powered chatbot using Streamlit and Hugging Face Spaces—integrating a model endpoint hosted on an Intel® Gaudi® accelerator.

Beginners are welcome

https://web.cvent.com/event/70e11f23-7c52-4994-a918-96fa9d5e935f/summary

1 reply

jeffboudier

posted an update 22 days ago

Post

1653

Today we launched Training Cluster as a Service, to make the new DGX Cloud Lepton supercloud easily accessible to AI researchers.

Hugging Face will collaborate with NVIDIA to provision and set up GPU training clusters to make them available for the duration of training runs.

Hugging Face organizations can sign up here: https://huggingface.co/training-cluster

badaoui

in optimum/neuron-export about 1 month ago

Using neuron cache instead of creating a pull-request in the target repo

❤️ 1

#1 opened about 1 month ago by

dacorvo


badaoui

published a Space about 1 month ago

Export to AWS Neuron

🏎

Export HF models to AWS Neuron-optimized format for Trn/Inf

badaoui

updated a dataset about 1 month ago

optimum/documentation-images

Viewer • Updated 15 days ago • 15 • 8.13k • 2

jeffboudier

posted an update about 1 month ago

Post

2454

👏 Congrats @jinanz on adding TimesFM time series forecasting to Transformers!

Learn how to use TimesFM in this blog post by the Nutanix team: https://huggingface.co/blog/Nutanix/introducing-timesfm-for-time-series-forcasting

jeffboudier

posted an update about 1 month ago

Post

490

Wrapping up a week of shipping and announcements with Dell Enterprise Hub now featuring AI Applications, on-device models for AI PCs, a new CLI and Python SDK... all you need for building AI on premises!

Blog post has all the details: https://huggingface.co/blog/dell-ai-applications

regisss

posted an update about 2 months ago

Post

2389

It will be very interesting to benchmark how energy-efficient the new Falcon-Edge is: https://huggingface.co/blog/tiiuae/falcon-edge

Should be super efficient on CPU according to the numbers published for BitNet: https://github.com/microsoft/BitNet/blob/main/assets/intel_performance.jpg

1 reply

jeffboudier

posted an update about 2 months ago

Post

2590

Transcribing 1 hour of audio for less than $0.01 🤯

@mfuntowicz cooked with 8x faster Whisper speech recognition - whisper-large-v3-turbo transcribes at 100x real time on a $0.80/hr L4 GPU!
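The "less than $0.01" figure follows from the two numbers quoted above. Here is a minimal sketch of that arithmetic, assuming the GPU is billed only for the time it spends transcribing, with no idle or startup overhead. The function name and parameters are hypothetical, for illustration only.

```python
def transcription_cost(audio_hours: float,
                       realtime_factor: float,
                       gpu_price_per_hour: float) -> float:
    """Return the GPU cost in dollars to transcribe `audio_hours` of audio.

    Assumption: the GPU is billed only while it is transcribing.
    """
    # At 100x real time, 1 hour of audio needs 1/100 of a GPU hour (36 seconds).
    gpu_hours = audio_hours / realtime_factor
    return gpu_hours * gpu_price_per_hour


# Figures quoted in the post: 100x real time on a $0.80/hr L4 GPU.
cost = transcription_cost(audio_hours=1.0,
                          realtime_factor=100.0,
                          gpu_price_per_hour=0.80)
print(f"${cost:.4f} per hour of audio")
```

Under these assumptions the cost comes to about $0.008 per hour of audio, which is consistent with the headline claim. Real bills would be higher if the endpoint sits idle between requests.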

How they did it: https://huggingface.co/blog/fast-whisper-endpoints

1-click deploy with HF Inference Endpoints: https://endpoints.huggingface.co/new?repository=openai%2Fwhisper-large-v3-turbo&vendor=aws&region=us-east&accelerator=gpu&instance_id=aws-us-east-1-nvidia-l4-x1&task=automatic-speech-recognition&no_suggested_compute=true

jeffboudier

posted an update about 2 months ago

Post

3020

So many orgs on HF would really benefit from security and governance built into Enterprise Hub - I wrote a guide on why and how to upgrade: https://huggingface.co/spaces/jeffboudier/how-to-upgrade-to-enterprise

For instance, did you know about Resource Groups?

pagezyhf

posted an update 2 months ago

Post

1995

If you haven't had the chance to test the latest open model from Meta, Llama 4 Maverick, go try it on AMD MI300 on Hugging Face!

amd/llama4-maverick-17b-128e-mi-amd

jeffboudier

posted an update 3 months ago

Post

2208

Llama 4 is out and Scout is already on the Dell Enterprise Hub to deploy on Dell systems 👉 dell.huggingface.co

jeffboudier

posted an update 3 months ago

Post

1572

Enterprise orgs now enable serverless Inference Providers for all members
- includes $2 of free usage per org member (e.g. an Enterprise org with 1,000 members shares $2,000 of free credit each month)
- admins can set a monthly spend limit for the entire org
- works today with Together, fal, Novita, Cerebras and HF Inference.

Here's the doc to bill Inference Providers usage to your org: https://huggingface.co/docs/inference-providers/pricing#organization-billing

2 replies

regisss

posted an update 5 months ago

Post

1758

Nice paper comparing the fp8 inference efficiency of Nvidia H100 and Intel Gaudi2: An Investigation of FP8 Across Accelerators for LLM Inference (2502.01070)

The conclusion is interesting: "Our findings highlight that the Gaudi 2, by leveraging FP8, achieves higher throughput-to-power efficiency during LLM inference"

One aspect of AI hardware accelerators that is often overlooked is how they consume less energy than GPUs. It's nice to see researchers starting to carry out experiments to measure this!

Gaudi3 results soon...


Team members 16
