🤖 AI Summary
Existing machine unlearning (MU) methods predominantly focus on weight adjustment and neglect controllable interventions at the data level. This work proposes Water4MU, the first MU framework to integrate digital watermarking. It embeds learnable, imperceptible watermarks into the original training data via a bidirectional watermarking mechanism to explicitly attenuate the influence of target samples on the model. It then uses bilevel optimization (BLO) to tie watermark design to model learning while keeping the two training processes decoupled: the upper level optimizes the watermarking network to minimize unlearning difficulty, and the lower level trains the model independently of watermarking. Evaluated on image classification and image generation tasks, Water4MU improves both unlearning effectiveness and efficiency, outperforming state-of-the-art methods particularly in challenging unlearning scenarios, while preserving performance on non-target tasks. Its core contributions are twofold: (1) establishing a novel "data-driven unlearning" paradigm, and (2) introducing the first differentiable, decoupled watermarking framework specifically designed for MU.
📝 Abstract
With the increasing demand for the right to be forgotten, machine unlearning (MU) has emerged as a vital tool for enhancing trust and regulatory compliance by enabling the removal of sensitive data influences from machine learning (ML) models. However, most MU algorithms primarily rely on in-training methods to adjust model weights, with limited exploration of the benefits that data-level adjustments could bring to the unlearning process. To address this gap, we propose a novel approach that leverages digital watermarking to facilitate MU by strategically modifying data content. By integrating watermarking, we establish a controlled unlearning mechanism that enables precise removal of specified data while maintaining model utility for unrelated tasks. We first examine the impact of watermarked data on MU, finding that MU effectively generalizes to watermarked data. Building on this, we introduce an unlearning-friendly watermarking framework, termed Water4MU, to enhance unlearning effectiveness. The core of Water4MU is a bi-level optimization (BLO) framework: at the upper level, the watermarking network is optimized to minimize unlearning difficulty, while at the lower level, the model itself is trained independently of watermarking. Experimental results demonstrate that Water4MU is effective in MU across both image classification and image generation tasks. Notably, it outperforms existing methods in challenging MU scenarios, known as "challenging forgets".
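The bi-level structure described in the abstract can be sketched in the generic BLO form below. This is only an illustrative template, not the paper's exact objective: the symbols $\psi$ (watermarking network parameters), $\theta$ (model parameters), $\mathcal{L}_{\mathrm{up}}$ (an upper-level loss measuring unlearning difficulty), and $\mathcal{L}_{\mathrm{low}}$ (the lower-level model training loss) are assumed notation, and the concrete loss terms are not specified in this summary.

```latex
\begin{aligned}
\min_{\psi}\quad & \mathcal{L}_{\mathrm{up}}\bigl(\psi,\ \theta^{*}(\psi)\bigr)
  && \text{(upper level: watermarking network, assumed notation)} \\
\text{s.t.}\quad & \theta^{*}(\psi) \in \arg\min_{\theta}\ \mathcal{L}_{\mathrm{low}}(\theta,\ \psi)
  && \text{(lower level: model training, assumed notation)}
\end{aligned}
```

In this template the upper-level variable is updated using the solution of the lower-level problem. The abstract states that the lower-level model training is carried out independently of watermarking, so how $\theta^{*}$ actually depends on $\psi$ in Water4MU is left open here.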