Beyond Prompt Engineering: Robust Behavior Control in LLMs via Steering Target Atoms Paper • 2505.20322 • Published 14 days ago • 14
Mixture of Tunable Experts - Behavior Modification of DeepSeek-R1 at Inference Time Article • By rbrt and 4 others • Feb 18 • 33
Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training Paper • 2505.14681 • Published 17 days ago • 9
Knowledge Augmented Complex Problem Solving with Large Language Models: A Survey Paper • 2505.03418 • Published May 6 • 8
A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment Paper • 2504.15585 • Published Apr 22 • 13
EasyEdit2: An Easy-to-use Steering Framework for Editing Large Language Models Paper • 2504.15133 • Published Apr 21 • 22
SynWorld: Virtual Scenario Synthesis for Agentic Action Knowledge Refinement Paper • 2504.03561 • Published Apr 4 • 18
ZJUKLAB at SemEval-2025 Task 4: Unlearning via Model Merging Paper • 2503.21088 • Published Mar 27 • 8
ADS-Edit: A Multimodal Knowledge Editing Dataset for Autonomous Driving Systems Paper • 2503.20756 • Published Mar 26 • 7
LookAhead Tuning: Safer Language Models via Partial Answer Previews Paper • 2503.19041 • Published Mar 24 • 5
CaKE: Circuit-aware Editing Enables Generalizable Knowledge Learners Paper • 2503.16356 • Published Mar 20 • 15
How to Reduce Memory Use in Reasoning Models Article • By Kseniase and 1 other • Mar 13 • 14
BiasEdit: Debiasing Stereotyped Language Models via Model Editing Paper • 2503.08588 • Published Mar 11 • 7
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers Paper • 2502.15007 • Published Feb 20 • 175
How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training Paper • 2502.11196 • Published Feb 16 • 22
ReLearn: Unlearning via Learning for Large Language Models Paper • 2502.11190 • Published Feb 16 • 29