🤖 AI Summary
Data-free model stealing attacks in cloud environments pose severe threats to the confidentiality of machine learning models, and existing defenses suffer from low detection rates, poor generalizability, and incomplete coverage. Method: This paper proposes Model-Guardian, a dual-module proactive defense framework: (1) DFMS-Detector (Data-Free Model Stealing Detector) identifies malicious queries by exploiting the artifact properties of synthetic samples together with gradient representations of query inputs; (2) DPreds returns deceptive predictions to suspected attackers, enabling active adversarial deception. Contribution/Results: Evaluated across seven mainstream data-free attacks, Model-Guardian achieves a mean detection accuracy of 98.7%, consistently outperforming eleven baseline methods and establishing new state-of-the-art performance with strong cross-attack generalization. The work also pioneers the use of various GANs and diffusion models to generate highly realistic attack query samples, which Model-Guardian still detects accurately.
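The detection idea above, using gradient representations of queries to flag synthetic samples, can be illustrated with a minimal sketch. Everything here is hypothetical: the linear "victim model," the L2-norm gradient feature, and the threshold rule are stand-ins for the paper's learned detector, not its actual method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the victim model: a linear classifier over 10 features
# and 3 classes (hypothetical; the paper's victim would be a deep network).
W = rng.normal(size=(10, 3))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def input_gradient(x):
    """Gradient of the cross-entropy loss (at the predicted class) w.r.t. the input.

    For logits z = W^T x, dL/dx = W @ (p - onehot(argmax p)); this plays the
    role of the backprop-computed input gradient of a deep model.
    """
    p = softmax(W.T @ x)
    onehot = np.zeros_like(p)
    onehot[np.argmax(p)] = 1.0
    return W @ (p - onehot)

def gradient_feature(x):
    """A simple scalar gradient representation: the L2 norm of the input gradient."""
    return float(np.linalg.norm(input_gradient(x)))

def looks_synthetic(x, threshold=0.5):
    """Flag a query as synthetic when its gradient feature exceeds a threshold.

    The real detector would feed richer gradient representations into a
    learned classifier; thresholding a norm is only an illustration.
    """
    return gradient_feature(x) > threshold
```

In the full framework a trained classifier consumes the gradient representation; the fixed threshold here merely shows where such a decision rule would sit.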
📝 Abstract
Model stealing attacks increasingly threaten the confidentiality of machine learning models deployed in the cloud. Recent studies reveal that adversaries can exploit data synthesis techniques to steal models even without access to real data, giving rise to data-free model stealing attacks. Existing defenses against such attacks suffer from limitations including poor effectiveness, insufficient generalization ability, and low comprehensiveness. In response, this paper introduces a novel defense framework named Model-Guardian. Comprising two components, the Data-Free Model Stealing Detector (DFMS-Detector) and Deceptive Predictions (DPreds), Model-Guardian addresses the shortcomings of current defenses by exploiting the artifact properties of synthetic samples and the gradient representations of query samples. Extensive experiments on seven prevalent data-free model stealing attacks demonstrate the effectiveness and superior generalization ability of Model-Guardian, which outperforms eleven defense methods and establishes a new state-of-the-art performance. Notably, this work pioneers the utilization of various GANs and diffusion models for generating highly realistic query samples in attacks, which Model-Guardian detects accurately.
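The deceptive-prediction idea behind DPreds can be sketched generically: perturb the returned probability vector so a thief distills a distorted signal, while preserving the top-1 class so benign accuracy is unaffected. This is an illustrative assumption only; the noise-based perturbation below is a placeholder, not the paper's actual DPreds mechanism.

```python
import numpy as np

def deceptive_prediction(probs, rng, strength=0.3):
    """Return a perturbed probability vector for a suspected stealing query.

    The top-1 class is kept fixed (so an honest user still gets the right
    label), but the rest of the distribution is distorted, degrading the
    soft-label signal an attacker would distill from.
    NOTE: random noise is a stand-in; the actual DPreds perturbation is
    derived differently.
    """
    probs = np.asarray(probs, dtype=float)
    top = np.argmax(probs)
    noise = rng.random(probs.shape) * strength
    perturbed = probs + noise
    # Lift the original top class above every other perturbed entry,
    # guaranteeing the argmax is unchanged.
    perturbed[top] = probs[top] + strength + noise.max()
    return perturbed / perturbed.sum()
```

A defender would invoke this only on queries that the detector flags, returning the unmodified prediction to benign traffic.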