Per-parameter Task Arithmetic for Unlearning in Large Language Models

📅 2026-01-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of catastrophic over-unlearning in large language models, where removing specific private information often degrades performance on unrelated tasks. To mitigate this issue, the authors propose Parameter-wise Task Arithmetic (PerTA), a method that enables precise unlearning by rescaling task vectors at the individual parameter level while preserving general knowledge. PerTA estimates the importance of each parameter either through gradients (PerTA-grad) or diagonal Fisher information (PerTA-fisher), allowing fine-grained control over the forgetting process. Experimental results demonstrate that PerTA consistently outperforms standard task arithmetic approaches and various training-based unlearning techniques, achieving superior trade-offs between effective removal of target information and retention of overall model utility.

📝 Abstract
In large language model (LLM) unlearning, private information must be removed from a trained model. Task arithmetic unlearns by subtracting a task vector (TV), defined as the parameter difference between a model fine-tuned on the private information and the original model. While efficient, this can cause over-forgetting by disrupting parameters essential for retaining other information. Motivated by the observation that each parameter differs in its importance for forgetting versus retention, we propose a per-parameter task arithmetic (PerTA) mechanism that rescales the TV with per-parameter weights. These weights quantify the relative importance of each parameter for forgetting versus retention and are estimated via gradients (PerTA-grad) or a diagonal Fisher information approximation (PerTA-fisher). We further discuss the effectiveness of PerTA, extend it to a more general form, and provide additional analysis. Extensive experiments demonstrate that PerTA consistently improves upon the standard TV and in many cases surpasses widely used training-based unlearning methods in both forgetting effectiveness and overall model utility. By retaining the efficiency of task arithmetic while mitigating over-forgetting, PerTA offers a principled and practical framework for LLM unlearning.
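The abstract's mechanism can be sketched in a few lines: compute the task vector as the parameter difference, weight each parameter by its relative importance for forgetting versus retention, and subtract the rescaled vector. This is a minimal, hypothetical NumPy sketch; the function names and the exact weighting formula (here a squared-gradient, diagonal-Fisher-style proxy) are illustrative assumptions, not the paper's precise definitions.

```python
import numpy as np

def task_vector(theta_tuned, theta_orig):
    """TV: parameters of the privacy-information-tuned model minus the original."""
    return theta_tuned - theta_orig

def perta_weights(grad_forget, grad_retain, eps=1e-8):
    """Per-parameter weight in [0, 1]: relative importance for forgetting
    versus retention. Squared gradients serve as a diagonal-Fisher-style
    importance proxy (an assumption for this sketch)."""
    imp_forget = grad_forget ** 2
    imp_retain = grad_retain ** 2
    return imp_forget / (imp_forget + imp_retain + eps)

def perta_unlearn(theta_orig, theta_tuned, grad_forget, grad_retain, lam=1.0):
    """Subtract the task vector, rescaled element-wise by the weights,
    instead of applying one global scaling factor."""
    tv = task_vector(theta_tuned, theta_orig)
    w = perta_weights(grad_forget, grad_retain)
    return theta_orig - lam * w * tv

# Toy usage: parameters important only for forgetting move fully,
# while parameters important for retention are left nearly untouched.
theta_orig = np.zeros(3)
theta_tuned = np.ones(3)
grad_forget = np.array([1.0, 1.0, 1.0])
grad_retain = np.array([0.0, 1.0, 10.0])
theta_unlearned = perta_unlearn(theta_orig, theta_tuned, grad_forget, grad_retain)
```

The contrast with standard task arithmetic is the element-wise `w`: with `w = 1` everywhere, the update reduces to the plain TV subtraction the abstract describes as prone to over-forgetting.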
Problem

Research questions and friction points this paper is trying to address.

unlearning
large language models
over-forgetting
task arithmetic
privacy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Per-parameter Task Arithmetic
LLM unlearning
Task Vector Rescaling
Over-forgetting Mitigation
Fisher Information