Speeding Up LLM Decoding with Advanced Universal Assisted Generation Techniques • By jmamou and 8 others • Mar 24 • 18
Universal Assisted Generation: Faster Decoding with Any Assistant Model • By danielkorat and 7 others • Oct 29, 2024 • 55
CPU Optimized Embeddings with 🤗 Optimum Intel and fastRAG • By peterizsak and 5 others • Mar 15, 2024 • 10