aws-neuron (AWS Inferentia and Trainium)

dacorvo

updated a model about 2 hours ago

aws-neuron/optimum-neuron-cache

Updated about 2 hours ago • 18

jeffboudier

posted an update 8 days ago

Post

2412

👏 Congrats @jinanz adding TimesFM times series forecasting to Transformers!

Learn how to use TimesFM in this blog post by the Nutanix team: https://huggingface.co/blog/Nutanix/introducing-timesfm-for-time-series-forcasting

jeffboudier

posted an update 13 days ago

Post

469

Wrapping up a week of shipping and announcements with Dell Enterprise Hub now featuring AI Applications, on-device models for AI PCs, a new CLI and Python SDK... all you need for building AI on premises!

Blog post has all the details: https://huggingface.co/blog/dell-ai-applications

jeffboudier

posted an update 23 days ago

Post

2569

Transcribing 1 hour of audio for less than $0.01 🤯

@mfuntowicz cooked with 8x faster Whisper speech recognition - whisper-large-v3-turbo transcribes at 100x real time on a $0.80/hr L4 GPU!

How they did it: https://huggingface.co/blog/fast-whisper-endpoints

1-click deploy with HF Inference Endpoints: https://endpoints.huggingface.co/new?repository=openai%2Fwhisper-large-v3-turbo&vendor=aws&region=us-east&accelerator=gpu&instance_id=aws-us-east-1-nvidia-l4-x1&task=automatic-speech-recognition&no_suggested_compute=true

jeffboudier

posted an update 29 days ago

Post

3012

So many orgs on HF would really benefit from security and governance built into Enterprise Hub - I wrote a guide on why and how upgrade: https://huggingface.co/spaces/jeffboudier/how-to-upgrade-to-enterprise

For instance, did you know about Resource Groups?

pagezyhf

posted an update about 1 month ago

Post

1978

If you haven't had the chance to test the latest open model from Meta, Llama 4 Maverick, go try it on AMD MI 300 on Hugging Face!

amd/llama4-maverick-17b-128e-mi-amd

jeffboudier

posted an update 2 months ago

Post

2204

Llama4 is out and Scout is already on the Dell Enterprise Hub to deploy on Dell systems 👉 dell.huggingface.co

jeffboudier

posted an update 2 months ago

Post

1565

Enterprise orgs now enable serverless Inference Providers for all members
- includes $2 free usage per org member (e.g. an Enterprise org with 1,000 members share $2,000 free credit each month)
- admins can set a monthly spend limit for the entire org
- works today with Together, fal, Novita, Cerebras and HF Inference.

Here's the doc to bill Inference Providers usage to your org: https://huggingface.co/docs/inference-providers/pricing#organization-billing

2 replies

·

pagezyhf

posted an update 4 months ago

Post

1757

We published https://huggingface.co/blog/deepseek-r1-aws!

If you are using AWS, give a read. It is a running document to showcase how to deploy and fine-tune DeepSeek R1 models with Hugging Face on AWS.

We're working hard to enable all the scenarios, whether you want to deploy to Inference Endpoints, Sagemaker or EC2; with GPUs or with Trainium & Inferentia.

We have full support for the distilled models, DeepSeek-R1 support is coming soon!! I'll keep you posted.

Cheers

1 reply

·

pagezyhf

posted an update 5 months ago

Post

462

Learn how to deploy multiple LoRA adapters on Vertex AI with this blogpost, using Hugging Face Deep Learning Containers on GCP.

https://medium.com/google-cloud/open-models-on-vertex-ai-with-hugging-face-serving-multiple-lora-adapters-on-vertex-ai-e3ceae7b717c

jeffboudier

posted an update 5 months ago

Post

751

NVIDIA just announced the Cosmos World Foundation Models, available on the Hub: nvidia/cosmos-6751e884dc10e013a0a0d8e6

Cosmos is a family of pre-trained models purpose-built for generating physics-aware videos and world states to advance physical AI development.
The release includes Tokenizers nvidia/cosmos-tokenizer-672b93023add81b66a8ff8e6

Learn more in this great community article by @mingyuliutw and @PranjaliJoshi https://huggingface.co/blog/mingyuliutw/nvidia-cosmos

1 reply

·

pagezyhf

posted an update 6 months ago

Post

379

Today you are able to access some of the most famous models from the Hugging Face community in Amazon Bedrock 🤯

Amazon Bedrock expands its model catalog with Bedrock Marketplace to hundreds of specialized models.

https://us-west-2.console.aws.amazon.com/bedrock/home?region=us-west-2#/model-catalog

pagezyhf

posted an update 6 months ago

Post

984

It’s 2nd of December , here’s your Cyber Monday present 🎁 !

We’re cutting our price down on Hugging Face Inference Endpoints and Spaces!

Our folks at Google Cloud are treating us with a 40% price cut on GCP Nvidia A100 GPUs for the next 3️⃣ months. We have other reductions on all instances ranging from 20 to 50%.

Sounds like the time to give Inference Endpoints a try? Get started today and find in our documentation the full pricing details.
https://ui.endpoints.huggingface.co/
https://huggingface.co/pricing

pagezyhf

posted an update 6 months ago

Post

312

Hello Hugging Face Community,

if you use Google Kubernetes Engine to host you ML workloads, I think this series of videos is a great way to kickstart your journey of deploying LLMs, in less than 10 minutes! Thank you @wietse-venema-demo !

To watch in this order:
1. Learn what are Hugging Face Deep Learning Containers
https://youtu.be/aWMp_hUUa0c?si=t-LPRkRNfD3DDNfr

2. Learn how to deploy a LLM with our Deep Learning Container using Text Generation Inference
https://youtu.be/Q3oyTOU1TMc?si=V6Dv-U1jt1SR97fj

3. Learn how to scale your inference endpoint based on traffic
https://youtu.be/QjLZ5eteDds?si=nDIAirh1r6h2dQMD

If you want more of these small tutorials and have any theme in mind, let me know!

jeffboudier

posted an update 7 months ago

Post

1121

New - add your bluesky account to your HF profile:
https://huggingface.co/settings/profile

Is the grass greener, the sky bluer? Will try and figure it out at https://bsky.app/profile/jeffboudier.bsky.social

By the way, HF people starter pack https://bsky.app/starter-pack/huggingface.bsky.social/3laz5x7naiz22

pagezyhf

posted an update 7 months ago

Post

1374

Hello Hugging Face Community,

I'd like to share here a bit more about our Deep Learning Containers (DLCs) we built with Google Cloud, to transform the way you build AI with open models on this platform!

With pre-configured, optimized environments for PyTorch Training (GPU) and Inference (CPU/GPU), Text Generation Inference (GPU), and Text Embeddings Inference (CPU/GPU), the Hugging Face DLCs offer:

⚡ Optimized performance on Google Cloud's infrastructure, with TGI, TEI, and PyTorch acceleration.
🛠️ Hassle-free environment setup, no more dependency issues.
🔄 Seamless updates to the latest stable versions.
💼 Streamlined workflow, reducing dev and maintenance overheads.
🔒 Robust security features of Google Cloud.
☁️ Fine-tuned for optimal performance, integrated with GKE and Vertex AI.
📦 Community examples for easy experimentation and implementation.
🔜 TPU support for PyTorch Training/Inference and Text Generation Inference is coming soon!

Find the documentation at https://huggingface.co/docs/google-cloud/en/index
If you need support, open a conversation on the forum: https://discuss.huggingface.co/c/google-cloud/69

jeffboudier

posted an update 8 months ago

Post

1112

This week in Inference Endpoints - thx @erikkaum for the update!

👀 https://huggingface.co/blog/erikkaum/endpoints-changelog

1 reply

·

jeffboudier

posted an update 9 months ago

Post

474

Inference Endpoints got a bunch of cool updates yesterday, this is my top 3

jeffboudier

posted an update 9 months ago

Post

4114

Pro Tip - if you're a Firefox user, you can set up Hugging Chat as integrated AI Assistant, with contextual links to summarize or simplify any text - handy!

In this short video I show how to set it up

3 replies

·

jeffboudier

posted an update about 1 year ago

Post

1724

TGI v2.0.2 is out!
- New models (idefics2, phi3)
- Cleaner VLM support in the openai layer
- Upgraded to pytorch 2.3.0

https://github.com/huggingface/text-generation-inference/releases/tag/v2.0.2

Kudos @Narsil @olivierdehaene @drbh and so many contributors!

AWS Inferentia and Trainium

AI & ML interests

Recent Activity

aws-neuron's activity

aws-neuron/optimum-neuron-cache

AI & ML interests

Recent Activity

Team members 29

aws-neuron's activity