Guarding the Privacy of Label-Only Access to Neural Network Classifiers via iDP Verification

📅 2025-02-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the risk of leaking individual training data from black-box neural network classifiers under label-only query access. We propose the first training-free intervention framework that guarantees rigorous individual differential privacy (iDP). Unlike conventional global-noise-based differential privacy (DP), our approach introduces the iDP deterministic bound (iDP-DB) and the formal verification framework LUCID, which integrates mixed-integer linear programming, hyper-network abstraction, and neuron-level linear relaxation to add noise only to the predicted labels of sensitive inputs. Experiments demonstrate that under 0-iDP (the strongest guarantee), accuracy drops by only 1.4%; under ε-iDP, the drop is 1.2%, substantially outperforming DP-based training methods, which incur a 12.7% accuracy loss. To our knowledge, this is the first method achieving both high prediction accuracy and provable, strict iDP guarantees for black-box classifiers in the label-only setting.

📝 Abstract
Neural networks are susceptible to privacy attacks that can extract private information of the training set. To cope, several training algorithms guarantee differential privacy (DP) by adding noise to their computation. However, DP requires adding noise considering every possible training set, which leads to a significant decrease in the network's accuracy. Individual DP (iDP) restricts DP to a given training set. We observe that some inputs deterministically satisfy iDP without any noise. By identifying them, we can provide iDP label-only access to the network with a minor decrease in its accuracy. However, identifying the inputs that satisfy iDP without any noise is highly challenging. Our key idea is to compute the iDP deterministic bound (iDP-DB), which overapproximates the set of inputs that do not satisfy iDP, and add noise only to their predicted labels. To compute the tightest iDP-DB, which enables guarding the label-only access with minimal accuracy decrease, we propose LUCID, which leverages several formal verification techniques. First, it encodes the problem as a mixed-integer linear program, defined over a network and over every network trained identically but without a unique data point. Second, it abstracts a set of networks using a hyper-network. Third, it eliminates the overapproximation error via a novel branch-and-bound technique. Fourth, it bounds the differences of matching neurons in the network and the hyper-network and employs linear relaxation if they are small. We show that LUCID can provide classifiers with a perfect individual privacy guarantee (0-iDP), which is infeasible for DP training algorithms, with an accuracy decrease of 1.4%. For more relaxed ε-iDP guarantees, LUCID has an accuracy decrease of 1.2%. In contrast, existing DP training algorithms reduce the accuracy by 12.7%.
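The guarded label-only access described in the abstract can be sketched in a few lines. The sketch below is illustrative, not the paper's exact mechanism: the `in_idp_db` predicate stands in for LUCID's precomputed iDP-DB membership test, and the noise mechanisms (a uniform label for 0-iDP, an exponential-mechanism-style draw for ε-iDP) are common illustrative choices, with all names hypothetical:

```python
import math
import random

random.seed(0)

def idp_label_access(logits_fn, in_idp_db, x, labels, epsilon=None):
    """Label-only access guarded by a precomputed iDP-DB overapproximation.

    logits_fn : the trained classifier (returns a list of logits for x)
    in_idp_db : predicate flagging inputs NOT proven deterministically iDP-safe
    epsilon   : None -> 0-iDP (label independent of the training set);
                float -> ε-iDP via an exponential-mechanism-style draw
    """
    logits = logits_fn(x)
    if not in_idp_db(x):
        # Verified safe: the predicted label is identical for the network and
        # every leave-one-out network, so releasing it exactly satisfies iDP.
        return labels[max(range(len(logits)), key=logits.__getitem__)]
    if epsilon is None:
        # 0-iDP: release a label that carries no information about any individual.
        return random.choice(labels)
    # ε-iDP: sample a label with weight exp(ε * logit / 2), shifting logits
    # for numerical stability (bounded sensitivity is assumed here).
    m = max(logits)
    weights = [math.exp(0.5 * epsilon * (s - m)) for s in logits]
    return random.choices(labels, weights=weights, k=1)[0]

# Toy usage: a linear "classifier" and a trivial stand-in iDP-DB test.
toy_logits = lambda x: [sum(x), -sum(x), 0.0]
flagged = lambda x: x[0] < 0          # pretend negative first features are unsafe
labels = ["cat", "dog", "bird"]
print(idp_label_access(toy_logits, flagged, [1.0, 2.0], labels))  # safe -> "cat"
```

The key point the sketch captures is that inputs outside the iDP-DB pay no accuracy cost at all, which is why the overall accuracy drop stays near 1%.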
Problem

Research questions and friction points this paper is trying to address.

Protects privacy in label-only access to neural networks.
Minimizes accuracy loss while ensuring individual differential privacy.
Uses formal verification to compute tight iDP deterministic bounds.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces the iDP deterministic bound (iDP-DB) to flag potentially sensitive inputs
Encodes the verification problem as a mixed-integer linear program
Applies a novel branch-and-bound technique to eliminate overapproximation error