Overcoming Forgetting in LLM Fine-Tuning with Evolution Strategies

📅 2026-05-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the observation that large language models fine-tuned with Evolution Strategies (ES) often lose performance on previously learned tasks, a decline frequently misattributed to irreversible catastrophic forgetting. The study shows that the decline is better described as recoverable performance drift and that it is not unique to ES: it can also arise under RL fine-tuning. To mitigate it, the authors propose Anchored Weight Decay (AWD), a lightweight regularization method that stabilizes prior-task performance by keeping the optimization trajectory close to the initial model in parameter space. Experiments show that AWD substantially reduces forgetting without compromising target-task performance, achieving stability comparable to that of large ES populations at much lower computational cost, which supports the viability of ES for continual learning.

📝 Abstract

Evolution Strategies (ES) has recently emerged as a competitive alternative to reinforcement learning (RL) for large language model (LLM) fine-tuning, offering advantages through simplicity, scalability, and inference-only training. However, recent work suggests that ES fine-tuning on new tasks may induce forgetting of prior tasks. First, this paper shows that prior task forgetting (1) is better characterized as performance drift rather than irreversible forgetting, with prior-task performance often recovering during ES training; and (2) is not a specific failure mode of ES, but can also arise for fine-tuning with RL methods. Second, it analyzes when and why such drift arises, highlighting its dependence on ES training dynamics, particularly random walk behavior in weakly constrained directions of the weight space. Third, based on these insights, it introduces Anchored Weight Decay (AWD) as a parameter-space regularization technique that constrains optimization toward the initial model parameters. AWD effectively stabilizes prior-task performance while preserving target-task performance, achieving benefits comparable to large ES population sizes at much lower computational cost. Thus, contrary to previous beliefs, the paper shows that prior-task forgetting under ES is largely avoidable, positioning ES as a promising approach for continual learning in LLMs.
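
To make the mechanism in the abstract concrete, here is a minimal toy sketch in pure Python. It is not the paper's implementation: the abstract does not give the exact AWD update, so the form used here (shrinking the offset from the initial weights by `lr * awd` at each step) is an assumption, as are the standardized-reward ES estimator, the toy reward, and all names (`es_step`, `target_reward`, `run`, `awd`). The reward depends only on coordinate 0, so the remaining coordinates play the role of weakly constrained directions in which ES noise can accumulate as a random walk.

```python
import math
import random


def es_step(theta, anchor, reward_fn, rng, pop_size=8, sigma=0.1, lr=0.01, awd=0.0):
    """One ES update; awd > 0 adds a pull back toward `anchor` (assumed form)."""
    dim = len(theta)
    noises = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
    rewards = [reward_fn([t + sigma * e for t, e in zip(theta, eps)]) for eps in noises]

    # Standardize rewards (a common ES fitness-shaping choice, assumed here).
    mean_r = sum(rewards) / pop_size
    std_r = math.sqrt(sum((r - mean_r) ** 2 for r in rewards) / pop_size) or 1.0
    z = [(r - mean_r) / std_r for r in rewards]

    new_theta = []
    for i in range(dim):
        # Monte Carlo search-gradient estimate for coordinate i.
        grad_i = sum(zk * eps[i] for zk, eps in zip(z, noises)) / (pop_size * sigma)
        # Anchored decay: shrink the offset from the initial weights,
        # not the weights themselves (hypothetical AWD form).
        new_theta.append(theta[i] + lr * grad_i - lr * awd * (theta[i] - anchor[i]))
    return new_theta


def target_reward(theta):
    """Toy target task: only coordinate 0 matters; all others are unconstrained."""
    return -(theta[0] - 1.0) ** 2


def run(awd, steps=400, dim=21, seed=0):
    """Train on the toy task and report the target coordinate and off-task drift."""
    rng = random.Random(seed)
    anchor = [0.0] * dim  # stands in for the initial (pretrained) weights
    theta = list(anchor)
    for _ in range(steps):
        theta = es_step(theta, anchor, target_reward, rng, awd=awd)
    # Distance from the anchor along the directions the reward does not constrain.
    drift = math.sqrt(sum((t - a) ** 2 for t, a in zip(theta[1:], anchor[1:])))
    return theta[0], drift


if __name__ == "__main__":
    for awd in (0.0, 2.0):
        x0, drift = run(awd)
        print(f"awd={awd}: target coord={x0:.3f}, drift in unconstrained dims={drift:.3f}")
```

In this toy setting the unconstrained coordinates wander away from the anchor when `awd=0.0` and stay much closer to it when `awd=2.0`, while coordinate 0 still moves to the neighborhood of its optimum at 1.0. Whether the same trade-off holds for LLM fine-tuning is what the paper's experiments address; the sketch only illustrates the idea.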

Problem

Research questions and friction points this paper is trying to address.

catastrophic forgetting

large language models

evolution strategies

continual learning

fine-tuning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Evolution Strategies

Catastrophic Forgetting

Anchored Weight Decay