InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU Paper • 2502.08910 • Published 9 days ago • 139
An Open Recipe: Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging Paper • 2502.09056 • Published 9 days ago • 30
ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates Paper • 2502.06772 • Published 11 days ago • 19
ARR: Question Answering with Large Language Models via Analyzing, Retrieving, and Reasoning Paper • 2502.04689 • Published 15 days ago • 7
view article Article Fine-tune Deepseek-R1 with a Synthetic Reasoning Dataset By sdiazlor • 11 days ago • 35
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling Paper • 2502.06703 • Published 11 days ago • 133
On-device Sora: Enabling Diffusion-Based Text-to-Video Generation for Mobile Devices Paper • 2502.04363 • Published 17 days ago • 11
Generating Symbolic World Models via Test-time Scaling of Large Language Models Paper • 2502.04728 • Published 15 days ago • 17
BOLT: Bootstrap Long Chain-of-Thought in Language Models without Distillation Paper • 2502.03860 • Published 16 days ago • 23
DeepRAG: Thinking to Retrieval Step by Step for Large Language Models Paper • 2502.01142 • Published 19 days ago • 23
view article Article π0 and π0-FAST: Vision-Language-Action Models for General Robot Control 18 days ago • 106
Self-supervised Quantized Representation for Seamlessly Integrating Knowledge Graphs with Large Language Models Paper • 2501.18119 • Published 23 days ago • 24