MoEEdit: Efficient and Routing-Stable Knowledge Editing for Mixture-of-Experts LLMs

2026-02-11
Citations: 0
Influential: 0

AI Summary
Existing knowledge editing methods for sparse Mixture-of-Experts (MoE) large language models often induce routing distribution shifts and incur high computational costs. To address this, this work proposes MoEEdit, a routing-stable parameter-editing framework tailored to MoE models, which the authors describe as the first of its kind. MoEEdit applies expert-level null-space projections that keep the router's inputs invariant, which mitigates routing drift, and uses block coordinate descent to compute edit updates efficiently. The reported results show state-of-the-art editing efficacy and generalization while maintaining high specificity and routing stability, with better compute and memory efficiency than existing methods.
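
The null-space idea mentioned in the summary can be illustrated with a small NumPy sketch. This is a generic null-space projection under assumed shapes, not necessarily the exact formulation used in MoEEdit: `K` is a hypothetical matrix of inputs whose expert outputs should be preserved, and `null_space_projector` is an illustrative helper name. Projecting a raw update onto the null space of `K` leaves the expert's output on those inputs unchanged, so anything computed downstream from those outputs (including later router inputs) is unaffected.

```python
import numpy as np

def null_space_projector(K, tol=1e-10):
    """Return P (d x d) such that P @ K is numerically zero.

    K has shape (d, n); its columns are the inputs to preserve (an assumption
    made for this sketch, not a detail taken from the paper).
    """
    U, s, _ = np.linalg.svd(K, full_matrices=False)
    # Keep only the directions actually spanned by the preserved inputs.
    rank = int(np.sum(s > tol * s.max())) if s.size and s.max() > 0 else 0
    U_r = U[:, :rank]
    # Identity minus the projector onto span(K).
    return np.eye(K.shape[0]) - U_r @ U_r.T

rng = np.random.default_rng(0)
d_in, d_out, n_keep = 16, 8, 5
K = rng.standard_normal((d_in, n_keep))        # preserved inputs (hypothetical)
W = rng.standard_normal((d_out, d_in))         # one expert's weight matrix
raw_delta = rng.standard_normal((d_out, d_in)) # unconstrained edit update

P = null_space_projector(K)
delta = raw_delta @ P                          # projected update

# Outputs on the preserved inputs are unchanged up to rounding error.
max_change = np.abs((W + delta) @ K - W @ K).max()
print(f"max output change on preserved inputs: {max_change:.2e}")
```

The projected update can still change the expert's output on inputs outside the span of `K`, which is what leaves room for the edit itself.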

Abstract
Knowledge editing (KE) enables precise modifications to factual content in large language models (LLMs). Existing KE methods are largely designed for dense architectures, limiting their applicability to the increasingly prevalent sparse Mixture-of-Experts (MoE) models that underpin modern scalable LLMs. Although MoEs offer strong efficiency and capacity scaling, naively adapting dense-model editors is both computationally costly and prone to routing distribution shifts that undermine stability and consistency. To address these challenges, we introduce MoEEdit, the first routing-stable framework for parameter-modifying knowledge editing in MoE LLMs. Our method reparameterizes expert updates via per-expert null-space projections that keep router inputs invariant and thereby suppress routing shifts. The resulting block-structured optimization is solved efficiently with a block coordinate descent (BCD) solver. Experiments show that MoEEdit attains state-of-the-art efficacy and generalization while preserving high specificity and routing stability, with superior compute and memory efficiency. These results establish a robust foundation for scalable, precise knowledge editing in sparse LLMs and underscore the importance of routing-stable interventions.
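
To make the block coordinate descent (BCD) step concrete, here is a minimal sketch on an assumed objective. The abstract only states that the optimization is block-structured and solved with BCD; the ridge-regularized least-squares objective, the gate matrix `G`, the key matrix `K`, the target residual `R`, and the function names below are all assumptions chosen for illustration. Each block is one expert's update, and each block subproblem is solved in closed form while the other experts are held fixed.

```python
import numpy as np

def objective(deltas, A, R, lam):
    """Assumed objective: ||R - sum_e deltas[e] @ A[e]||^2 + lam * sum_e ||deltas[e]||^2."""
    fit = np.einsum("eoi,ein->on", deltas, A)
    return np.sum((R - fit) ** 2) + lam * np.sum(deltas ** 2)

def bcd_expert_updates(A, R, lam=1e-2, sweeps=25):
    """Cycle over experts, solving each expert's ridge subproblem exactly.

    A: (E, d_in, n) gate-weighted inputs per expert (hypothetical layout).
    R: (d_out, n) residual the combined expert updates should produce.
    """
    E, d_in, _ = A.shape
    d_out = R.shape[0]
    deltas = np.zeros((E, d_out, d_in))
    fit = np.zeros_like(R)                 # running sum_e deltas[e] @ A[e]
    for _ in range(sweeps):
        for e in range(E):
            fit -= deltas[e] @ A[e]        # remove this expert's contribution
            resid = R - fit                # what is left for expert e to explain
            gram = A[e] @ A[e].T + lam * np.eye(d_in)
            # Closed form: delta_e = resid A_e^T (A_e A_e^T + lam I)^{-1}
            deltas[e] = np.linalg.solve(gram, A[e] @ resid.T).T
            fit += deltas[e] @ A[e]
    return deltas

rng = np.random.default_rng(0)
E, d_in, d_out, n = 4, 12, 6, 10
K = rng.standard_normal((d_in, n))         # edit inputs (hypothetical)
G = rng.random((E, n))
G /= G.sum(axis=0, keepdims=True)          # gate weights per input sum to 1
A = K[None, :, :] * G[:, None, :]          # gate-weighted inputs per expert
R = rng.standard_normal((d_out, n))
lam = 1e-2

loss_start = objective(np.zeros((E, d_out, d_in)), A, R, lam)
loss_one = objective(bcd_expert_updates(A, R, lam, sweeps=1), A, R, lam)
loss_final = objective(bcd_expert_updates(A, R, lam, sweeps=25), A, R, lam)
print(loss_start, loss_one, loss_final)
```

Because each block update is an exact minimizer of a strongly convex subproblem, the objective cannot increase from one update to the next, which is the property that makes this kind of solver attractive when the full joint system over all experts would be expensive to form.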
Problem

Research questions and friction points this paper is trying to address.

Knowledge Editing
Mixture-of-Experts
Routing Stability
Sparse LLMs
Parameter Modification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture-of-Experts
Knowledge Editing
Routing Stability
Null-Space Projection
Block Coordinate Descent