SQuARE: Sequential Question Answering Reasoning Engine for Enhanced Chain-of-Thought in Large Language Models Paper • 2502.09390 • Published 9 days ago • 16
Article Universal Assisted Generation: Faster Decoding with Any Assistant Model Oct 29, 2024 • 52
CoTAR: Chain-of-Thought Attribution Reasoning with Multi-level Granularity Paper • 2404.10513 • Published Apr 16, 2024 • 2
RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation Paper • 2408.02545 • Published Aug 5, 2024 • 36
Distributed Speculative Inference of Large Language Models Paper • 2405.14105 • Published May 23, 2024 • 17
Article Blazing Fast SetFit Inference with 🤗 Optimum Intel on Xeon Apr 3, 2024 • 11
Article Accelerate StarCoder with 🤗 Optimum Intel on Xeon: Q8/Q4 and Speculative Decoding Jan 30, 2024 • 9
Article SetFitABSA: Few-Shot Aspect Based Sentiment Analysis using SetFit Dec 6, 2023 • 6