Article: Building the Hugging Face MCP Server (by evalstate and 3 others)
Article: Reachy Mini - The Open-Source Robot for Today's and Tomorrow's AI Builders (by thomwolf and 1 other)
Space: Kolors Virtual Try-On 👕 (upload images to see virtual try-on results)
Paper: Radial Attention: $O(n\log n)$ Sparse Attention with Energy Decay for Long Video Generation (arXiv 2506.19852)
Post:
Inference for generative AI models looks like a minefield, but there's a simple protocol for picking the best inference setup:

🌍 95% of users >> If you're using open (large) models and need fast online inference, use Inference Providers in auto mode and let it choose the best provider for the model (see the sketch after this post). https://huggingface.co/docs/inference-providers/index

👷 Fine-tuners / bespoke >> If you've got custom setups, use Inference Endpoints to define a configuration on AWS, Azure, or GCP. https://endpoints.huggingface.co/

🦫 Locals >> If you're trying to stretch everything you can out of a server or local machine, use llama.cpp, Jan, LM Studio, or vLLM. https://huggingface.co/settings/local-apps#local-apps

🪟 Browsers >> If you need open models running right in the browser, use transformers.js. https://github.com/huggingface/transformers.js

Let me know what you're using, and whether you think it's more complex than this.
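As a minimal sketch of the auto-mode route above, using the huggingface_hub Python client; the model name is chosen purely for illustration, and any model served by Inference Providers would work:

```python
# pip install huggingface_hub
from huggingface_hub import InferenceClient

# provider="auto" lets the client route the request to the best
# available inference provider for the requested model.
client = InferenceClient(provider="auto")

# Illustrative model choice; swap in whatever open model you need.
response = client.chat_completion(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Summarize what MCP is in one sentence."}],
    max_tokens=100,
)
print(response.choices[0].message.content)
```

The appeal of auto mode is that the calling code stays the same while the routing decision (which provider actually serves the model) is delegated to the platform.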