Steering Out-of-Distribution Generalization with Concept Ablation Fine-Tuning Paper • 2507.16795 • Published 7 days ago • 2
Monet: Mixture of Monosemantic Experts for Transformers Paper • 2412.04139 • Published Dec 5, 2024 • 14
🥨 Bavarian NLP Papers Collection Awesome papers about Bavarian NLP • 9 items • Updated 15 days ago • 2
Article Bringing Fusion Down to Earth: ML for Stellarator Optimization By cgeorgiaw • 27 days ago • 70
TuCo: Measuring the Contribution of Fine-Tuning to Individual Responses of LLMs Paper • 2506.23423 • Published 30 days ago • 1
ELI-Why Collection 🧠 ELI-Why: Evaluating the Pedagogical Utility of Language Model Explanations ACL Findings 2025 • 4 items • Updated Jun 11 • 3
Article Enhance Your Models in 5 Minutes with the Hugging Face Kernel Hub By drbh and 6 others • Jun 12 • 116
Decomposing MLP Activations into Interpretable Features via Semi-Nonnegative Matrix Factorization Paper • 2506.10920 • Published Jun 12 • 6
From Flat to Hierarchical: Extracting Sparse Representations with Matching Pursuit Paper • 2506.03093 • Published Jun 3 • 2
🧠 Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 24 items • Updated May 19 • 162
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning Paper • 2506.01939 • Published Jun 2 • 176
Article The Transformers Library: standardizing model definitions By lysandre and 3 others • May 15 • 116
Article *Context Is Gold to Find the Gold Passage*: Evaluating and Training Contextual Document Embeddings By manu and 1 other • Jun 2 • 24
FAMA Collection The First Large-Scale Open-Science Speech Foundation Model for English and Italian • 5 items • Updated May 30 • 10
Unsupervised Word-level Quality Estimation for Machine Translation Through the Lens of Annotators (Dis)agreement Paper • 2505.23183 • Published May 29 • 2
SAEs Are Good for Steering -- If You Select the Right Features Paper • 2505.20063 • Published May 26 • 1
Mechanistic evaluation of Transformers and state space models Paper • 2505.15105 • Published May 21 • 1