One-Step Residual Shifting Diffusion for Image Super-Resolution via Distillation • Paper • 2503.13358 • Published 8 days ago • 87
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models • Paper • 2503.16419 • Published 5 days ago • 58
Optimizing Decomposition for Optimal Claim Verification • Paper • 2503.15354 • Published 6 days ago • 18
φ-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation • Paper • 2503.13288 • Published 8 days ago • 46
Implicit Reasoning in Transformers is Reasoning through Shortcuts • Paper • 2503.07604 • Published 15 days ago • 21
WritingBench: A Comprehensive Benchmark for Generative Writing • Paper • 2503.05244 • Published 18 days ago • 17
DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs • Paper • 2503.07067 • Published 15 days ago • 29
SEAP: Training-free Sparse Expert Activation Pruning Unlock the Brainpower of Large Language Models • Paper • 2503.07605 • Published 15 days ago • 65
Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders • Paper • 2503.03601 • Published 20 days ago • 216
HybridNorm: Towards Stable and Efficient Transformer Training via Hybrid Normalization • Paper • 2503.04598 • Published 19 days ago • 18
EuroBERT: Scaling Multilingual Encoders for European Languages • Paper • 2503.05500 • Published 18 days ago • 75