Beyond Hard Writes and Rigid Preservation: Soft Recursive Least-Squares for Lifelong LLM Editing

📅 2026-01-22
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the plasticity–stability trade-off faced by large language models during continual editing. The authors propose a soft-constrained online quadratic optimization framework that effectively mitigates cumulative interference and forgetting of early edits. By integrating recursive least squares (RLS) with the Woodbury identity, they design an efficient recursive update mechanism that anchors pre-trained knowledge while supporting an unbounded sequence of edits at constant computational cost. Dual regularization terms are introduced to control weight deviations, enabling the model to maintain high editing success rates and strong retention of early edits over tens of thousands of consecutive updates. The method significantly outperforms existing approaches across GLUE, reasoning, and code generation benchmarks.

📝 Abstract
Model editing updates a pre-trained LLM with new facts or rules without re-training, while preserving unrelated behavior. In real deployment, edits arrive as long streams, and existing editors often face a plasticity-stability dilemma: locate-then-edit "hard writes" can accumulate interference over time, while null-space-style "hard preservation" preserves only what is explicitly constrained, so past edits can be overwritten and unconstrained behaviors may deviate, degrading general capabilities in the many-edits regime. We propose RLSEdit, a recursive least-squares editor for long sequential editing. RLSEdit formulates editing as an online quadratic optimization with soft constraints, minimizing a cumulative key-value fitting objective with two regularizers that control for both deviation from the pre-trained weights and from a designated anchor mapping. The resulting update admits an efficient online recursion via the Woodbury identity, with per-edit cost independent of history length and scaling only with the current edit size. We further provide deviation bounds and an asymptotic characterization of the adherence-preservation trade-off in the many-edits regime. Experiments on multiple model families demonstrate stable scaling to 10K edits, outperforming strong baselines in both edit success and holistic stability -- crucially retaining early edits, and preserving general capabilities on GLUE and held-out reasoning/code benchmarks.
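To make the abstract's mechanism concrete, here is a minimal NumPy sketch of the kind of soft-constrained recursive least-squares update it describes. This is an illustration assembled from the abstract alone, not the authors' implementation: the class name `SoftRLSEditor`, the regularization weights `lam0`/`lam_a`, and the objective min_W Σ‖W k − v‖² + λ₀‖W − W₀‖² + λₐ‖W − Wₐ‖² are assumptions. The Woodbury identity lets each edit update the inverse Gram matrix at a cost that scales with the current edit size m, not with the number of past edits.

```python
import numpy as np

def woodbury_rank_update(P, U):
    """Update P = A^{-1} to (A + U U^T)^{-1} via the Woodbury identity.

    Cost depends only on the edit size m = U.shape[1], never on how
    many key-value pairs were absorbed before.
    """
    m = U.shape[1]
    PU = P @ U                                # (d, m)
    S = np.eye(m) + U.T @ PU                  # (m, m) capacitance matrix
    return P - PU @ np.linalg.solve(S, PU.T)

class SoftRLSEditor:
    """Hypothetical soft-constrained RLS editor for one linear layer.

    Maintains the closed-form minimizer of
        sum_i ||W k_i - v_i||^2 + lam0 ||W - W0||^2 + lam_a ||W - Wa||^2,
    i.e. W = (V K^T + lam0 W0 + lam_a Wa)(K K^T + (lam0+lam_a) I)^{-1},
    updated recursively as key-value batches stream in.
    """
    def __init__(self, W0, W_anchor, lam0=1.0, lam_a=1.0):
        d = W0.shape[1]
        lam = lam0 + lam_a
        self.P = np.eye(d) / lam              # (K K^T + lam I)^{-1}, K empty so far
        self.B = lam0 * W0 + lam_a * W_anchor # running right-hand side V K^T + reg terms
        self.W = self.B @ self.P              # current edited weights

    def edit(self, K, V):
        """Absorb one batch of keys K (d x m) and target values V (out x m)."""
        self.P = woodbury_rank_update(self.P, K)
        self.B = self.B + V @ K.T
        self.W = self.B @ self.P
        return self.W
```

Because `P` and `B` are the only state carried between edits, memory and per-edit compute stay constant over an unbounded edit stream, which is the property the abstract emphasizes; the two regularizers softly pull `W` back toward the pre-trained weights and the anchor mapping instead of imposing hard null-space constraints.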
Problem

Research questions and friction points this paper is trying to address.

lifelong editing
plasticity-stability dilemma
model editing
sequential edits
capability preservation
Innovation

Methods, ideas, or system contributions that make the work stand out.

recursive least-squares
lifelong model editing
soft constraints
online optimization
plasticity-stability trade-off
Xinyu Wang
PhD student, McGill University
Large Language Model, Retrieval Augmented Generation, Quantization
Sicheng Lyu
McGill University, Mila—Quebec AI Institute, SimpleWay.AI
Yu Gu
McGill University
Jerry Huang
Mila—Quebec AI Institute, Université de Montréal
Peng Lu
University of Montreal
Machine Learning, Natural Language Processing
Yufei Cui
McGill University, MILA
Medical AI, RAG, LLM Agent, Predictive Uncertainty
Xiao-Wen Chang
McGill University