ARMOR: Shielding Unlearnable Examples against Data Augmentation

📅 2025-01-15
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing data augmentation techniques undermine the privacy-preserving mechanism of unlearnable examples (samples intentionally crafted to be non-learnable by models), rendering originally protected data learnable after augmentation. To address this, the authors propose ARMOR, a defense framework that systematically characterizes, for the first time, how augmentation compromises unlearnability. ARMOR introduces three key components: (1) a non-local-module-assisted surrogate model that better approximates the learning dynamics of the unseen training pipeline; (2) a class-aware augmentation selection strategy that preserves unlearnability by maximizing distribution alignment between augmented and non-augmented samples; and (3) a dynamic step size adjustment algorithm that strengthens the defensive noise optimization. Evaluated on four benchmark datasets with five common augmentation methods, ARMOR reduces the test accuracy of models trained on augmented protected samples by as much as 60% more than six state-of-the-art baselines, significantly improving the robustness of unlearnable examples under augmentation and strengthening privacy guarantees.
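For context, here is a minimal PyTorch sketch of the error-minimizing ("unlearnable examples") objective that ARMOR builds on: noise is optimized to *minimize* the training loss so the protected samples carry almost no learnable signal. The surrogate model, noise budget `eps`, step size `alpha`, and step count are illustrative assumptions, not the paper's actual settings.

```python
import torch
import torch.nn.functional as F
import torchvision

# Illustrative surrogate classifier; any differentiable model works here.
model = torchvision.models.resnet18(num_classes=10)
model.eval()

def error_minimizing_noise(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Craft bounded noise that MINIMIZES the training loss, so the
    perturbed samples appear 'already learned' and the model extracts
    almost nothing from them (the min-min unlearnable-examples objective)."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = F.cross_entropy(model((x + delta).clamp(0, 1)), y)
        grad, = torch.autograd.grad(loss, delta)
        # Descend the loss (sign-flipped PGD): the opposite of an adversarial attack.
        delta = (delta - alpha * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    return delta.detach()

# Example: protect a batch of CIFAR-sized RGB images.
x = torch.rand(8, 3, 32, 32)
y = torch.randint(0, 10, (8,))
x_unlearnable = (x + error_minimizing_noise(model, x, y)).clamp(0, 1)
```

The summary's point is that random crops, flips, and similar transforms shift the pixels this noise was fitted to, which is what restores learnability and what ARMOR's three components are designed to counter.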

📝 Abstract
Private data, when published online, may be collected by unauthorized parties to train deep neural networks (DNNs). To protect privacy, defensive noises can be added to original samples to degrade their learnability by DNNs. Recently, unlearnable examples have been proposed, which minimize the training loss such that the model learns almost nothing. However, raw data are often pre-processed before being used for training, which may restore the private information of protected data. In this paper, we reveal the data privacy violation induced by data augmentation, a commonly used pre-processing technique for improving model generalization; to the best of our knowledge, this is the first work to do so. We demonstrate that data augmentation can significantly raise the accuracy of a model trained on unlearnable examples from 21.3% to 66.1%. To address this issue, we propose a defense framework, dubbed ARMOR, to protect data privacy from potential breaches caused by data augmentation. To overcome the difficulty of having no access to the model training process, we design a non-local module-assisted surrogate model that better captures the effect of data augmentation. In addition, we design a surrogate augmentation selection strategy that maximizes distribution alignment between augmented and non-augmented samples, so as to choose the optimal augmentation strategy for each class. We also use a dynamic step size adjustment algorithm to enhance the defensive noise generation process. Extensive experiments on four datasets and five data augmentation methods verify the performance of ARMOR. Comparisons with six state-of-the-art defense methods demonstrate that ARMOR preserves the unlearnability of protected private data under data augmentation, reducing the test accuracy of the model trained on augmented protected samples by as much as 60% more than baselines.
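Two of the abstract's optimization ideas, crafting noise against augmented views and adapting the step size during crafting, can be sketched as follows. The augmentation pipeline and the loss-plateau halving rule below are assumptions for illustration; they stand in for, but are not, ARMOR's per-class distribution-alignment selection and its exact step size schedule.

```python
import torch
import torch.nn.functional as F
from torchvision import transforms

# Hypothetical augmentation pool: the paper selects a strategy per class by
# distribution alignment; a single fixed pipeline stands in for it here.
augment = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
])

def craft_augmentation_robust_noise(model, x, y, eps=8/255, alpha=2/255, steps=20):
    """Minimize the training loss on AUGMENTED views so the defensive noise
    survives pre-processing; halve the step size when the loss plateaus
    (an assumed stand-in for the paper's dynamic step size adjustment)."""
    delta = torch.zeros_like(x, requires_grad=True)
    best = float('inf')
    for _ in range(steps):
        # Each step sees a fresh random view, so the noise cannot rely on
        # pixel positions that a crop or flip would move.
        loss = F.cross_entropy(model(augment((x + delta).clamp(0, 1))), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta - alpha * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
        if loss.item() < best - 1e-4:
            best = loss.item()
        else:
            alpha *= 0.5  # assumed plateau rule, not the paper's exact schedule
    return delta.detach()
```

The design intuition: by averaging the crafting objective over random views, the noise must encode a shortcut that is invariant to the transforms, which is exactly the property plain unlearnable examples lack.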
Problem

Research questions and friction points this paper is trying to address.

Privacy Leakage
Data Augmentation
Protection System
Innovation

Methods, ideas, or system contributions that make the work stand out.

Data Augmentation
Privacy Protection
ARMOR Defense System
👥 Authors
Xueluan Gong, Nanyang Technological University (Computer science)
Yuji Wang, School of Cyber Science and Engineering, Wuhan University, China
Yanjiao Chen, College of Electrical Engineering, Zhejiang University (Wireless networks, network security, Internet of Things)
Haocheng Dong, School of Cyber Science and Engineering, Wuhan University, China
Yiming Li, ZJU-Hangzhou Global Scientific and Technological Innovation Center (HIC) and State Key Laboratory of Blockchain and Data Security, Zhejiang University, China
Mengyuan Sun, School of Cyber Science and Engineering, Wuhan University, China
Shuaike Li, University of Science and Technology of China
Qian Wang, School of Cyber Science and Engineering, Wuhan University, China
Chen Chen, Nanyang Technological University, Singapore