So many orgs on HF would really benefit from security and governance built into Enterprise Hub - I wrote a guide on why and how upgrade: jeffboudier/how-to-upgrade-to-enterprise
What are you using to evaluate models or AI systems? So far we're building lighteval & leaderboards on the hub but still feels early & a lot more to build. What would be useful to you?
FramePack is hands down one of the best OS releases in video generation 🙇🏻♀️🤯 ✅ fully open sourced + amazing quality + reduced memory + improved speed but more even - its gonna facilitate *soooo* many downstream applications like this version adapted for landscape rotation 👇https://huggingface.co/spaces/tori29umai/FramePack_rotate_landscape
The meta-llama org just crossed 40,000 followers on Hugging Face. Grateful for all their impact on the field sharing the Llama weights openly and much more!
We need more of this from all other big tech to make the AI more open, collaborative and beneficial to all!
At xet-team we've been hard at work bringing a new generation of storage to the Hugging Face community, and we’ve crossed some major milestones:
👷 Over 2,000 builders and nearing 100 organizations with access to Xet 🚀 Over 70,000 model and dataset repositories are Xet-backed 🤯 1.4 petabytes managed by Xet
As we move repos from LFS to Xet for everyone we onboard, we’re pushing our content-addressed store (CAS). Check out the chart below 👇 of CAS hitting up to 150 Gb/s throughput this past week.
All of this growth is helping us build richer insights. We expanded our repo graph, which maps how Xet-backed repositories on the Hub share bytes with each other.
Check out the current network in the image below (nodes are repositories, edges are where repos share bytes) and visit the space to see how different versions of Qwen, Llama, and Phi models are grouped together xet-team/repo-graph
Introducing the ONNX model explorer: Browse, search, and visualize neural networks directly in your browser. 🤯 A great tool for anyone studying Machine Learning! We're also releasing the entire dataset of graphs so you can use them in your own projects! 🤗
The dataset distils reasoning chains from arXiv research papers in biology and economics. Some nice features of the dataset:
- Extracts both the logical structure AND researcher intuition from academic papers - Adopts the persona of researchers "before experiments" to capture exploratory thinking - Provides multi-short and single-long reasoning formats with token budgets - Shows 7.2% improvement on MMLU-Pro Economics when fine-tuning a 3B model
It's created using the Curator framework with plans to scale across more scientific domains and incorporate multi-modal reasoning with charts and mathematics.
I personally am very excited about datasets like this, which involve creativity in their creation and don't just rely on $$$ to produce a big dataset with little novelty.
Energy is a massive constraint for AI but do you even know what energy your chatGPT convos are using?
We're trying to change this by releasing ChatUI-energy, the first interface where you see in real-time what energy your AI conversations consume. Great work from @jdelavande powered by spaces & TGI, available for a dozen of open-source models like Llama, Mistral, Qwen, Gemma and more.
New king of open VLMs: InternVL3 takes Qwen 2.5's crown! 👑
InternVL have been a wildly successful series of model : and the latest iteration has just taken back their crown thanks to their superior, natively multimodal vision training pipeline.
➡️ Most of the vision language models (VLMs) these days are built like Frankenstein : take a good text-only Large Language Model (LLM) backbone, stitch a specific vision transformer (ViT) on top of it. Then the training is sequential 🔢 : 1. Freeze the LLM weights while you train the ViT only to work with the LLM part, then 2. Unfreeze all weights to train all weights in order to work together.
💫 The Shanghai Lab decided to challenge this paradigm and chose this approach that they call "native". For each of their model sizes, they still start from a good LLM (mostly Qwen-2.5 series, did I tell you I'm a huge fan of Qwen? ❤️), and stitch the ViT, but they don't freeze anything : they train all weights together with interleaved text and image understanding data in a single pre-training phase 🎨.
They claim it results in more seamless interactions between modalities. And the results prove them right: they took the crown of top VLMs, at nearly all sizes, from their Qwen-2.5 parents. 👑