Reasoning Vectors: Transferring Chain-of-Thought Capabilities via Task Arithmetic Paper • 2509.01363 • Published Sep 1, 2025 • 58
Encoders vs Decoders: the Ettin Suite Collection A collection of SOTA, open-data, paired encoder-only and decoder-only models ranging from 17M to 1B parameters. See the paper at https://arxiv.org/abs/250 • 32 items • Updated Jul 16, 2025 • 25
FLEXITOKENS: Flexible Tokenization for Evolving Language Models Paper • 2507.12720 • Published Jul 17, 2025 • 9 • 3
PAROAttention: Pattern-Aware ReOrdering for Efficient Sparse and Quantized Attention in Visual Generation Models Paper • 2506.16054 • Published Jun 19, 2025 • 60
FaithfulSAE: Towards Capturing Faithful Features with Sparse Autoencoders without External Dataset Dependencies Paper • 2506.17673 • Published Jun 21, 2025 • 7
Steering Conceptual Bias via Transformer Latent-Subspace Activation Paper • 2506.18887 • Published Jun 23, 2025 • 6
RuleReasoner: Reinforced Rule-based Reasoning via Domain-aware Dynamic Sampling Paper • 2506.08672 • Published Jun 10, 2025 • 30
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published Jun 2, 2025 • 187
Shifting AI Efficiency From Model-Centric to Data-Centric Compression Paper • 2505.19147 • Published May 25, 2025 • 144