Article Prefill and Decode for Concurrent Requests - Optimizing LLM Performance By tngtech • Apr 16 • 25
Article How Long Prompts Block Other Requests - Optimizing LLM Performance By tngtech • Jun 12 • 5
Article What's Software 3.0? (Spoiler: You're Already Using It) By fdaudens • Jun 19 • 2
Article ScreenEnv: Deploy your full stack Desktop Agent By A-Mahla and 1 other • 16 days ago • 51
Article Transformers Are Getting Old: Variants and Alternatives Exist! By ProCreations • 21 days ago • 42
Article Should We Still Pretrain Encoders with Masked Language Modeling? By Nicolas-BZRD and 3 others • 24 days ago • 21
Article Enhance Your Models in 5 Minutes with the Hugging Face Kernel Hub By drbh and 6 others • Jun 12 • 115
Article Microsoft and Hugging Face expand collaboration By jeffboudier and 2 others • May 19 • 23
Granite Time Series Models Collection A collection of time series models trained by IBM, licensed under Apache 2.0. • 7 items • Updated Jun 16 • 29
Article The New and Fresh analytics in Inference Endpoints By erikkaum and 4 others • Mar 21 • 21
Article Blazingly fast whisper transcriptions with Inference Endpoints By mfuntowicz and 5 others • May 13 • 72
Article Releasing the largest multilingual open pretraining dataset By Pclanglais and 2 others • Nov 13, 2024 • 102