AI & ML interests

None defined yet.

SpacesExamples's activity

jsulz posted an update 6 days ago
As xet-team infrastructure begins backing hundreds of repositories on the Hugging Face Hub, we're getting to put on our researcher hats and peer into the bytes. 👀 🤓

IMO, one of the most interesting ideas Xet storage introduces is a globally shared store of data.

When you upload a file through Xet, the contents are split into ~64KB chunks and deduplicated, but what if those same chunks already exist in another repo on the Hub?

If we can detect and reuse them, we can skip uploading them entirely, saving time and bandwidth for AI builders. More on how that works here:
🔗 https://huggingface.co/blog/from-chunks-to-blocks#scaling-deduplication-with-aggregation
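To make the idea concrete, here's a minimal sketch of chunk-level dedup. It uses fixed-size chunks and an in-memory index for simplicity (the real system uses content-defined chunk boundaries and a distributed store), and send_chunk is a hypothetical stand-in for the network call:

```python
import hashlib

CHUNK_SIZE = 64 * 1024  # ~64KB, the chunk size mentioned above

# Stand-in for the Hub-wide chunk index; in production this is a
# distributed content-addressed store, not an in-memory set.
global_chunk_index: set[str] = set()

def send_chunk(digest: str, data: bytes) -> None:
    # Hypothetical network call to the storage backend.
    print(f"uploading {digest[:12]}... ({len(data)} bytes)")

def upload_file(path: str) -> None:
    """Split a file into chunks and upload only the ones the store has
    never seen - no matter which repo uploaded them first."""
    with open(path, "rb") as f:
        while chunk := f.read(CHUNK_SIZE):
            digest = hashlib.sha256(chunk).hexdigest()
            if digest in global_chunk_index:
                continue  # already stored somewhere on the Hub: skip it
            global_chunk_index.add(digest)
            send_chunk(digest, chunk)
```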

Because of this, different repositories can share bytes we store. That opens up something cool - we can draw a graph of which repos actually share data at the chunk level, where:

- Nodes = repositories
- Edges = shared chunks
- Edge thickness = how much they overlap

xet-team/repo-graph
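If you're curious how a graph like that might be assembled, here's a rough sketch with networkx; the repo names and chunk sets below are made up, while the actual space works from real chunk-level metadata:

```python
import networkx as nx

# Hypothetical input: for each repo, the set of chunk hashes it references.
repo_chunks = {
    "org-a/bert-base": {"c1", "c2", "c3"},
    "org-b/bert-finetuned": {"c1", "c2", "c4"},
    "org-c/unrelated-model": {"c5"},
}

G = nx.Graph()
G.add_nodes_from(repo_chunks)  # nodes = repositories

repos = list(repo_chunks)
for i, r1 in enumerate(repos):
    for r2 in repos[i + 1:]:
        shared = repo_chunks[r1] & repo_chunks[r2]
        if shared:
            # edges = shared chunks; the weight drives edge thickness
            G.add_edge(r1, r2, weight=len(shared))

print(G.edges(data=True))  # [('org-a/bert-base', 'org-b/bert-finetuned', {'weight': 2})]
```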

Come find the many BERT islands. Or see how datasets relate in practice, not just in theory. See how libraries or tasks can tie repositories together. You can play around with node size using storage/likes/downloads too.

The result is a super fun visualization from @saba9 and @znation that I've already lost way too much time to. I'm excited to see how the networks grow as we add more repositories!
jsulz posted an update 7 days ago
What does it mean when models share the same bytes?

We've investigated some quants and found that quantizations of the same model often share a considerable portion of their bytes, which can be deduplicated to save significant upload time for quantizers on the Hub.

This space, where we crack open a repo from @bartowski, shows we can get significant dedupe: xet-team/quantization-dedup

You can get a sense of why by reading this write-up: https://github.com/bartowski1182/llm-knowledge/blob/main/quantization/quantization.md
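For a back-of-the-envelope version of the same measurement, you could hash two local quant files chunk by chunk and compare. The filenames here are hypothetical, and fixed-size chunks will understate the overlap that content-defined chunking finds (CDC boundaries don't shift when bytes are inserted):

```python
import hashlib

def chunk_hashes(path: str, chunk_size: int = 64 * 1024) -> set[str]:
    """Hash a file in fixed-size chunks and return the set of digests."""
    hashes = set()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            hashes.add(hashlib.sha256(chunk).hexdigest())
    return hashes

# Hypothetical local quantizations of the same base model.
q4 = chunk_hashes("model-Q4_K_M.gguf")
q5 = chunk_hashes("model-Q5_K_M.gguf")

shared = len(q4 & q5)
print(f"shared chunks: {shared} ({shared / len(q4 | q5):.1%} of all unique chunks)")
```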

But what about finetuned models?

Since going into production, the xet-team has migrated hundreds of repositories on the Hub to our storage layer, including classic "pre-Hub" open-source models like FacebookAI/xlm-roberta-large (XLM-R).

XLM-R, introduced in 2019, set new benchmarks for multilingual NLP by learning shared representations across 100 languages. It was then fine-tuned on English, Spanish, Dutch, and German, generating language-specific derivations for each - check out the paper: Unsupervised Cross-lingual Representation Learning at Scale (1911.02116)

These finetunes share much of the same architecture and layout as XLM-R with similar training methods and goals. It makes sense that they would share bytes, but it's still fascinating to see.

We put together a similar space to explore these models and see where they overlap - check it out for yourself: xet-team/finetune-dedupe

The darker each block in the heatmap, the more the bytes are shared. Clicking on a repo's blocks shows all other repos that share blocks.
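A minimal version of that kind of heatmap is easy to sketch with matplotlib; the model list and overlap numbers below are invented for illustration, while the space computes them from real block-level data:

```python
import matplotlib.pyplot as plt
import numpy as np

models = ["xlm-r", "ft-en", "ft-es", "ft-nl", "ft-de"]
# Hypothetical pairwise overlap (fraction of bytes shared).
overlap = np.array([
    [1.00, 0.45, 0.44, 0.46, 0.43],
    [0.45, 1.00, 0.41, 0.40, 0.39],
    [0.44, 0.41, 1.00, 0.42, 0.40],
    [0.46, 0.40, 0.42, 1.00, 0.41],
    [0.43, 0.39, 0.40, 0.41, 1.00],
])

fig, ax = plt.subplots()
im = ax.imshow(overlap, cmap="Blues")  # darker = more bytes shared
ax.set_xticks(range(len(models)), labels=models, rotation=45, ha="right")
ax.set_yticks(range(len(models)), labels=models)
fig.colorbar(im, label="fraction of bytes shared")
plt.tight_layout()
plt.show()
```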
jsulz posted an update 8 days ago
The Llama 4 release - meta-llama/llama-4-67f0c30d9fe03840bc9d0164 - was a big one for the xet-team, with every model backed by the storage infrastructure of the future for the Hub.

It's been a wild few days, and especially 🤯 to see every tensor file with a Xet logo next to it instead of LFS.

The attached graph shows requests per second to our content-addressed store (CAS) right as the release went live.

yellow = GETs; dashed line = launch time.

You can definitely tell when the community started downloading 👀

h/t to @rajatarya for the graph, to the entire Xet crew for bringing us to this point, and a special shoutout to Rajat, @port8080, @brianronan, @seanses, and @znation, who made sure the bytes kept flying all weekend ⚡️
jsulz posted an update 10 days ago
Huge week for xet-team as Llama 4 is the first major model on Hugging Face uploaded with Xet providing the backing! Every byte downloaded comes through our infrastructure.

Using Xet on Hugging Face is the fastest way to download and iterate on open source models, and we've proved it with Llama 4, which saw a boost of ~25% across all models.

We expect builders on the Hub to see even more improvements, helping power innovation across the community.

With the models on our infrastructure, we can peer in and see how well our dedupe performs across the Llama 4 family. On average, we're seeing ~25% dedupe, providing huge savings to the community who iterate on these state-of-the-art models. The attached image shows a few selected models and how they perform on Xet.
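As a quick sanity check on what a dedupe percentage means here: it's the share of uploaded bytes that never needed new storage because identical chunks already existed. A toy calculation, with numbers invented to match the ~25% figure:

```python
def dedupe_ratio(logical_bytes: int, stored_bytes: int) -> float:
    """Fraction of bytes that didn't need new storage."""
    return 1 - stored_bytes / logical_bytes

# Hypothetical: 100 GB of weights needing only 75 GB of new storage.
print(f"{dedupe_ratio(100 * 10**9, 75 * 10**9):.0%}")  # -> 25%
```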

Thanks to the meta-llama team for launching on Xet!
jsulz posted an update 28 days ago
If you've been following along with the Xet Team's (xet-team) work, you know we've been working to migrate the Hugging Face Hub from Git LFS to Xet.

Recently, we launched a waitlist to join the movement to Xet (join here! https://huggingface.co/join/xet), but getting to this point was a journey.

From the initial proof of concept in August, to launching on the Hub internally, to migrating a set of repositories and routing a small chunk of download traffic through our infrastructure: every step of the way has been full of challenges, big and small, and well worth the effort.

Over the past few weeks, with real traffic flowing through our services, we've tackled some truly gnarly issues (unusual upload/download patterns, memory leaks, load imbalances, and more) and resolved each without major disruptions.

If you're curious about how this sliver of Hub infrastructure looked as we routed traffic through it for the first time (and want a deep dive full of Grafana and Kibana charts 🤓), I have a post for you.

Here's an inside look into the day of our first migrations and the weeks following, where we pieced together solutions in real time.

https://huggingface.co/blog/xet-on-the-hub
jsulz posted an update about 1 month ago
It's finally here ❤️

Build faster than ever with lightning fast upload and download speeds, starting today on the Hub ⚡

Xet storage is rolling out across the Hub - join the waitlist here: https://huggingface.co/join/xet

You can apply for yourself or your entire organization. Head over to your account settings for more information, or join from anywhere you see the Xet logo on a repository you know.

Have questions? Join the conversation below 👇 or open a discussion on the Xet team page: xet-team/README
jsulz posted an update about 1 month ago
If you haven't already, I strongly recommend reading @chiphuyen's AI Engineering: https://www.goodreads.com/en/book/show/216848047-ai-engineering

It comes complete with a section on open source AI (of obvious interest to the crowd here) and more than one mention of the Hugging Face community 🤗

In my opinion, one of the best parts is that it is a compendium for seminal and cutting-edge AI resources, with nearly 250 arXiv papers cited. I've done my best to collect them all in a single place, organized by chapter and by order in which they appear in the book:
jsulz/ai-engineering-67c5abe02c8596b5c089934c

Happy reading 🤓
lysandre posted an update about 2 months ago
SmolVLM-2 and SigLIP-2 are now part of transformers in dedicated releases!

They're added on top of the v4.49.0 release, and can be installed from the following tags: v4.49.0-SmolVLM-2 and v4.49.0-SigLIP-2.
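For reference, a tag like that can be installed straight from GitHub with something along the lines of `pip install git+https://github.com/huggingface/transformers@v4.49.0-SmolVLM-2` (and likewise for the SigLIP-2 tag).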

This marks a new beginning for the release process of transformers. For the past five years, we've been doing monthly releases featuring many models (v4.49.0, the latest release, features 9 new architectures).

Starting with SmolVLM-2 & SigLIP-2, we'll now additionally release tags supporting new models on a stable branch. These models are therefore directly available for use by installing from the tag itself. These tags will continue to be updated with fixes applied to these models.

Going forward, continue expecting software releases following semantic versioning: v4.50.0 will have ~10 new architectures compared to v4.49.0, as well as a myriad of new features, improvements and bug fixes. Accompanying these software releases, we'll release tags offering brand new models as fast as possible, to make them accessible to all immediately.
jsulz posted an update about 2 months ago
Time flies!

Six months after joining Hugging Face, the Xet team is kicking off the first migrations from LFS to our storage for a number of repositories on the Hub.

More on the nitty gritty details behind the migration soon, but here are the big takeaways:

🤖 We've successfully completed the first migrations from LFS -> Xet to test the infrastructure and prepare for a wider release

✅ No action on your part needed - you can work with a Xet-backed repo like any other repo on the Hub (for now - major improvements on their way!)

👀 Keep an eye out for the Xet logo to see if a repo you know is on our infra! See the screenshots below to spot the difference 👇

⏩ ⏩ ⏩ Blazing uploads and downloads coming soon. We're gearing up for a full integration with the Hub's Python library that will make building on the Hub faster than ever - special thanks to @celinah and @Wauplin for their assistance.

🎉 Want early access? If you're curious and want to test out the bleeding edge that will power the development experience on the Hub, we'd love to partner with you. Let me know!

This is the culmination of a lot of effort from the entire team. Big round of applause to @sirahd @brianronan @jgodlewski @hoytak @seanses @assafvayner @znation @saba9 @rajatarya @port8080 @yuchenglow
jsulz posted an update 2 months ago
Toward the end of last year, the Xet team provided an inside look into the foundations of how we plan to enable rapid experimentation and iteration for the AI builders on the Hub: https://huggingface.co/blog/from-files-to-chunks

But it turns out chunks aren't all you need!

Our goal is to bring:
🚀 Faster uploads
⬇️ Speedy downloads
💪 All without sacrificing your workflow

To do that, we need the infrastructure and system design to back it up. As we prepare to roll out the first Xet-backed repositories on the Hub, we wrote up a post explaining the nitty-gritty details of the decisions that bring this to life: https://huggingface.co/blog/from-chunks-to-blocks

Complete with an interactive visualization that shows the power of deduplication in action - taking a 191GB repo down to ~97GB and shaving hours off upload times.

The darker each block in the heatmap, the more we dedupe, the less we have to transfer. Clicking on a file's blocks shows all other files that share blocks.

Check it out and explore for yourself! xet-team/quantization-dedup
victor posted an update 2 months ago
Hey everyone, we've given the https://hf.co/spaces page a fresh update!

Smart Search: Now just type what you want to do - like "make a viral meme" or "generate music" - and our search gets it.

New Categories: Check out the cool new filter bar with icons to help you pick a category fast.

Redesigned Space Cards: Reworked a bit to really show off the app descriptions, so you know what each Space does at a glance.

Random Prompt: Need ideas? Hit the dice button for a burst of inspiration.

We'd love to hear what you think - drop us some feedback plz!
victor posted an update 3 months ago
Finally, an open-source AI that turns your lyrics into full songs is here - meet YuE! Unlike other tools that only create short clips, YuE can make entire songs (up to 5 minutes) with vocals, melody, and instruments all working together. Letsss go!

m-a-p/YuE-s1-7B-anneal-en-cot
jbilcke-hf posted an update 4 months ago
Doing some testing with HunyuanVideo on the Hugging Face Inference Endpoints 🤗

prompt: "a Shiba Inu is acting as a DJ, he wears sunglasses and is mixing and scratching with vinyl discs at a Ibiza sunny sand beach party"

1280x720, 22 steps, 121 frames

There are still some things to iron out regarding speed and memory usage; right now it takes 20min on an A100 (see attached charts)

but you can check it out here:

https://huggingface.co/jbilcke-hf/HunyuanVideo-for-InferenceEndpoints

There are various things I want to try like the 100% diffusers version and other models (LTX-Video..)
jsulz posted an update 4 months ago
Doing a lot of benchmarking and visualization work, which means I'm always searching for interesting repos in terms of file types, size, branches, and overall structure.

To help, I built a Space jsulz/repo-info that lets you search for any repo and get back:

- Treemap of the repository, color coded by file/directory size
- Repo branches and their size
- Cumulative size of different file types (e.g., the total size of all the safetensors in the repo)

And because I'm interested in how this will fit into our work to leverage content-defined chunking for versioning repos on the Hub (https://huggingface.co/blog/from-files-to-chunks), everything also shows the number of chunks (1 chunk = 64KB) as well as the total size in bytes.
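For a rough sense of how those numbers can be pulled, here's a sketch using huggingface_hub; the repo id is just an example, and the chunk count is a naive size // 64KB estimate rather than the real chunking:

```python
from collections import defaultdict
from huggingface_hub import HfApi

CHUNK_SIZE = 64 * 1024  # 1 chunk = 64KB

# files_metadata=True populates per-file sizes on the siblings list.
info = HfApi().model_info("black-forest-labs/FLUX.1-dev", files_metadata=True)

by_ext = defaultdict(int)
for sibling in info.siblings:
    if sibling.size:
        by_ext[sibling.rfilename.rsplit(".", 1)[-1]] += sibling.size

# Cumulative size per file type, largest first.
for ext, total in sorted(by_ext.items(), key=lambda kv: -kv[1]):
    print(f"{ext:>12}: {total / 1e9:7.2f} GB ~ {total // CHUNK_SIZE:,} chunks")
```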

Some of the treemaps are pretty cool. Attached are black-forest-labs/FLUX.1-dev and, for fun, laion/laion-audio-preview (which has nearly 10k .tar files 🤯)

victor posted an update 5 months ago
Qwen/QwQ-32B-Preview shows us the future (and it's going to be exciting)...

I tested it against some really challenging reasoning prompts and the results are amazing 🤯.

Check this dataset for the results: victor/qwq-misguided-attention
jsulz posted an update 5 months ago
Something I love about working at Hugging Face is the opportunity to design and work in public. Right now, we're redesigning the architecture that supports uploads and downloads on the Hub.

Datasets and models are growing fast, and so are the challenges of storing and transferring them efficiently. To keep up, we're introducing a new protocol for uploads and downloads, supported by a content-addressed store (CAS).

Here's what's coming:

📦 Smarter uploads: Chunk-level management enables advanced deduplication and compression, and reduces redundant transfers, speeding up uploads.
⚡ Efficient downloads: High throughput and low latency ensure fast access, even during high-demand model releases.
🔒 Enhanced security: Validate uploads before storage to block malicious or invalid data.
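To illustrate the core property a CAS gives you, here's a toy in-memory version (nothing like the production service, but it shows why dedup and validation fall out of addressing data by its content):

```python
import hashlib

class ContentAddressedStore:
    """Toy content-addressed store: objects are keyed by the hash of
    their bytes, so duplicate data is stored once and every object's
    integrity can be checked against its own key."""

    def __init__(self):
        self._objects: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self._objects.setdefault(key, data)  # duplicate puts are free
        return key

    def get(self, key: str) -> bytes:
        data = self._objects[key]
        # Validation for free: recomputing the hash detects corruption
        # or substituted bytes before they are served.
        assert hashlib.sha256(data).hexdigest() == key
        return data
```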

We analyzed 24 hours of global upload activity in October (88 countries, 130TB of data!) to design a system that scales with your needs.

The result? A proposed infrastructure with CAS nodes in us-east-1, eu-west-3, and ap-southeast-1.

🔗 Read the blog post for the full details: https://huggingface.co/blog/rearchitecting-uploads-and-downloads

🌟 Check out our interactive demo to explore the data yourself!
xet-team/cas-analysis

We'd love to hear your feedback - let us know if you have questions or want to see more.
victor posted an update 5 months ago
Want a perfect example of why Qwen/Qwen2.5-Coder-32B-Instruct is insane?

Introducing: AI Video Composer 🔥
huggingface-projects/ai-video-composer

Drag and drop your assets (images/videos/audios) to create any video you want using natural language!

It works by asking the model to output a valid FFmpeg command, which can be quite complex, but most of the time Qwen2.5-Coder-32B gets it right (that thing is a beast). It's an update of an old project made with GPT-4; it was almost impossible to make it work with open models back then (~1.5 years ago), but not anymore - let's go open weights 🚀.
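The pattern is simple to reproduce with huggingface_hub's InferenceClient; here's a minimal sketch (the prompt wording is improvised, and you'd still want to review the generated command before executing it):

```python
from huggingface_hub import InferenceClient

client = InferenceClient("Qwen/Qwen2.5-Coder-32B-Instruct")

# Hypothetical assets and request, as a user of the space might provide.
assets = "dog.png (image), beat.mp3 (audio)"
request = "make a 10s video slideshow of the image set to the music"

response = client.chat_completion(
    messages=[{
        "role": "user",
        "content": f"Assets: {assets}. Task: {request}. "
                   "Reply with a single valid ffmpeg command and nothing else.",
    }],
    max_tokens=256,
)
print(response.choices[0].message.content)  # the ffmpeg command, to review and run
```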
victor posted an update 5 months ago
Qwen2.5-72B is now the default HuggingChat model.
This model is so good that you must try it! I often get better results on rephrasing with it than with Sonnet or GPT-4!!
jsulz posted an update 5 months ago
When the XetHub crew joined Hugging Face this fall, @erinys and I started brainstorming how to share our work to replace Git LFS on the Hub. Uploading and downloading large models and datasets takes precious time. That's where our chunk-based approach comes in.

Instead of versioning files (like Git and Git LFS), we version variable-sized chunks of data. For the Hugging Face community, this means:

โฉ Only upload the chunks that changed.
๐Ÿš€ Download just the updates, not the whole file.
๐Ÿง  We store your file as deduplicated chunks
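Here's a toy illustration of why content-defined chunking behaves this way; the hash below is a simplistic stand-in for the rolling hash real CDC uses, and the size parameters are arbitrary:

```python
def cdc_chunks(data: bytes, mask: int = 0xFFF,
               min_size: int = 2 * 1024, max_size: int = 128 * 1024) -> list[bytes]:
    """Toy content-defined chunking: cut wherever the running hash
    matches a boundary pattern. Because boundaries come from the content
    itself rather than fixed offsets, edits tend to disturb only the
    chunks near the change instead of shifting every chunk after it."""
    chunks, start, h = [], 0, 0
    for i, b in enumerate(data):
        h = (h * 31 + b) & 0xFFFFFFFF  # simplistic stand-in for a rolling hash
        size = i - start + 1
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])  # content-chosen boundary
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing remainder
    return chunks
```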

In our benchmarks, we found that using CDC to store iterative model and dataset versions led to transfer speedups of ~2x, but this isn't just a performance boost. It's a rethinking of how we manage models and datasets on the Hub.

We're planning to bring our new storage backend to the Hub in early 2025 - check out our blog to dive deeper, and let us know: how could this improve your workflows?

https://huggingface.co/blog/from-files-to-chunks