Article Prefill and Decode for Concurrent Requests - Optimizing LLM Performance By tngtech • Apr 16 • 25
Article How Long Prompts Block Other Requests - Optimizing LLM Performance By tngtech • Jun 12 • 5
Article What's Software 3.0? (Spoiler: You're Already Using It) By fdaudens • Jun 19 • 2
Article ScreenEnv: Deploy your full stack Desktop Agent By A-Mahla and 1 other • 16 days ago • 51
Article Three Mighty Alerts Supporting Hugging Face’s Production Infrastructure By jcudit • 18 days ago • 8
Article Transformers Are Getting Old: Variants and Alternatives Exist! By ProCreations • 21 days ago • 42
Article Should We Still Pretrain Encoders with Masked Language Modeling? By Nicolas-BZRD and 3 others • 24 days ago • 21
Post 2536
so many multimodal releases these days 🤠
> ERNIE-4.5-VL: new vision language MoE models by Baidu https://huggingface.co/models?search=ernie-4.5-vl
> new visual document retrievers by NVIDIA (sota on ViDoRe!) nvidia/llama-nemoretriever-colembed-3b-v1 nvidia/llama-nemoretriever-colembed-1b-v1
> Ovis-3b: new image-text in image-text out models by Alibaba ⤵️ https://huggingface.co/spaces/AIDC-AI/Ovis-U1-