🤖 AI Summary
Model inversion attacks (MIAs) pose severe threats to data privacy in deep neural networks, yet existing defenses struggle to balance robustness and model utility. This paper identifies intrinsic vulnerabilities in the fundamental workflow of MIAs, comprehensively revealing systematic flaws in their basic pipeline, and proposes CALoR, a novel defense framework. CALoR integrates confidence-adaptive logit clipping, low-rank compression of the classification head, and a robustness-enhanced classification loss specifically designed for MIA mitigation; together these components mislead the attacker's optimization objective, reduce the information leaked by the model, and impede the backpropagation that MIAs rely on. Evaluated across multiple datasets, architectures, and attack settings, CALoR achieves state-of-the-art defense performance against MIAs and generalizes better than existing defenses across diverse scenarios.
📝 Abstract
Model Inversion Attacks (MIAs) aim to recover privacy-sensitive training data from the knowledge encoded in released machine learning models. Recent advances in the MIA field have significantly enhanced attack performance under multiple scenarios, posing serious privacy risks to Deep Neural Networks (DNNs). However, defense strategies against MIAs have lagged behind: existing defenses cannot resist the latest MIAs and fail to achieve a better trade-off between model utility and model robustness. In this paper, we provide an in-depth analysis from the perspective of the intrinsic vulnerabilities of MIAs, comprehensively uncovering the weaknesses inherent in their basic pipeline, which previous defenses have only partially investigated. Building upon these new insights, we propose a robust defense mechanism integrating Confidence Adaptation and Low-Rank compression (CALoR). Our method includes a novel robustness-enhanced classification loss specifically designed for model inversion defenses and reveals the remarkable effectiveness of compressing the classification head. With CALoR, we can mislead the attacker's optimization objective, reduce the information leaked by the model, and impede the backpropagation of MIAs, thus mitigating the risk of privacy leakage. Extensive experimental results demonstrate that our method achieves state-of-the-art (SOTA) defense performance against MIAs and generalizes better than existing defenses across various scenarios.
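To make the two mechanisms named in the abstract concrete, here is a minimal, purely illustrative sketch: clipping logits more tightly as prediction confidence rises (limiting the signal an inversion attack can extract from extreme scores), and replacing a full-rank classification head with a truncated-SVD low-rank approximation. This is not the authors' implementation; the function names, the clipping schedule, and all constants (`base_clip`, `rank`) are hypothetical choices for demonstration.

```python
# Illustrative sketch only -- not the CALoR paper's actual implementation.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def confidence_adaptive_clip(logits, base_clip=5.0):
    """Clip logits into a range that shrinks as top-1 confidence grows,
    so highly confident (most attack-informative) outputs leak less."""
    conf = softmax(logits).max(axis=-1, keepdims=True)   # top-1 confidence in (0, 1]
    bound = base_clip * (1.0 - conf) + 1.0               # tighter bound at high confidence
    return np.clip(logits, -bound, bound)

def low_rank_head(W, rank):
    """Compress a classification-head weight matrix W (classes x features)
    to the given rank via truncated SVD, reducing the subspace available
    for gradient-based inversion."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 10)) * 8.0        # a batch of raw logits
W = rng.normal(size=(10, 512))                 # full-rank head: 10 classes, 512 features

clipped = confidence_adaptive_clip(logits)
W_lr = low_rank_head(W, rank=3)
```

In this toy version, every clipped logit stays within `[-6, 6]` (since the bound never exceeds `base_clip + 1`), and `W_lr` has rank at most 3 regardless of the original head's dimensions; the actual paper combines these ideas with its robustness-enhanced loss.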