REALTALK: A 21-Day Real-World Dataset for Long-Term Conversation Paper • 2502.13270 • Published 3 days ago • 3
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines Paper • 2502.14739 • Published 1 day ago • 81
Article Mixture of Tunable Experts - Behavior Modification of DeepSeek-R1 at Inference Time By rbrt and 4 others • 3 days ago • 19
Small Models, Big Impact: Efficient Corpus and Graph-Based Adaptation of Small Multilingual Language Models for Low-Resource Languages Paper • 2502.10140 • Published 8 days ago • 9
Can a Single Model Master Both Multi-turn Conversations and Tool Use? CALM: A Unified Conversational Agentic Language Model Paper • 2502.08820 • Published 9 days ago • 4
Tiny-Agent-a Collection fast and powerful agentic models designed to run on edge devices. • 6 items • Updated 10 days ago • 7
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU Paper • 2502.08910 • Published 9 days ago • 139
CoSER: Coordinating LLM-Based Persona Simulation of Established Roles Paper • 2502.09082 • Published 9 days ago • 27
MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency Paper • 2502.09621 • Published 8 days ago • 27
CoT-Valve: Length-Compressible Chain-of-Thought Tuning Paper • 2502.09601 • Published 8 days ago • 12
Ultravox v0.5 Collection Ultravox is a multimodal Speech LLM built around different pretrained LLMs (frozen) and the whisper-large-v3-turbo (fine-tuned) backbone. • 3 items • Updated 11 days ago • 5
TransMLA: Multi-head Latent Attention Is All You Need Paper • 2502.07864 • Published 10 days ago • 43
Towards Trustworthy Retrieval Augmented Generation for Large Language Models: A Survey Paper • 2502.06872 • Published 14 days ago • 8
NoLiMa: Long-Context Evaluation Beyond Literal Matching Paper • 2502.05167 • Published 14 days ago • 15