A Model Stealing Attack Against Multi-Exit Networks

📅 2023-05-23
🏛️ IEEE International Conference on Acoustics, Speech, and Signal Processing
📈 Citations: 1
Influential: 0
🤖 AI Summary
Existing model stealing attacks replicate only the functional behavior of multi-exit networks, neglecting their output policies—i.e., the threshold-based decision mechanisms governing early exits—resulting in stolen models that forfeit early-exit advantages and suffer substantial inference inefficiency. This work introduces the first joint stealing framework for multi-exit networks, simultaneously recovering both model functionality and exit decision policies. Methodologically, we propose a dual-objective loss function comprising performance loss and policy loss, model exit distributions via kernel density estimation, and devise an adaptive output policy search algorithm to enable end-to-end joint optimization. Extensive experiments across multiple multi-exit architectures (e.g., BranchyNet, Exit-First) and benchmark datasets (CIFAR-10/100, ImageNet-1K) demonstrate that our approach achieves the highest fidelity to the original model in both accuracy and early-exit rate—significantly outperforming state-of-the-art baselines—and establishes the first instance of high-fidelity, dual-objective (functional + efficiency-preserving) model stealing.
📝 Abstract
Compared to traditional neural networks with a single output channel, a multi-exit network has multiple exits that allow for early outputs from the model's intermediate layers, thus significantly improving computational efficiency while maintaining similar main task accuracy. Existing model stealing attacks can only steal the model's utility while failing to capture its output strategy, i.e., a set of thresholds used to determine from which exit to output. This leads to a significant decrease in computational efficiency for the extracted model, thereby losing the advantage of multi-exit networks. In this paper, we propose the first model stealing attack against multi-exit networks to extract both the model utility and the output strategy. We employ Kernel Density Estimation to analyze the target model's output strategy and use performance loss and strategy loss to guide the training of the extracted model. Furthermore, we design a novel output strategy search algorithm to maximize the consistency between the victim model and the extracted model's output behaviors. In experiments across multiple multi-exit networks and benchmark datasets, our method always achieves accuracy and efficiency closest to the victim models.
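The threshold-based output strategy the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the use of maximum softmax probability as the confidence measure and the specific `thresholds` values are assumptions.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def multi_exit_predict(exit_logits, thresholds):
    """Return (prediction, exit_index) under a threshold-based output strategy.

    exit_logits: one logit vector per exit, ordered shallow to deep.
    thresholds: one confidence threshold per internal exit; the final
    exit always emits a prediction.
    """
    for i, logits in enumerate(exit_logits[:-1]):
        probs = softmax(np.asarray(logits, dtype=float))
        if probs.max() >= thresholds[i]:  # confident enough: exit early
            return int(probs.argmax()), i
    final = softmax(np.asarray(exit_logits[-1], dtype=float))
    return int(final.argmax()), len(exit_logits) - 1
```

An attack that replicates only the input-to-prediction mapping, without these thresholds, forces every input through the full network, which is the efficiency loss the paper targets.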
Problem

Research questions and friction points this paper is trying to address.

Stealing both the utility and the output strategy of multi-exit networks.
Preserving the computational efficiency of the extracted model, which existing attacks forfeit.
Maximizing consistency between the victim and extracted models' output behaviors.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Kernel Density Estimation to analyze the victim model's output strategy
Combines a performance loss and a strategy loss to guide training of the extracted model
Develops a novel output strategy search algorithm to align exit behaviors
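A minimal sketch of how kernel density estimation can model a distribution of per-exit confidence scores, as the strategy-analysis step suggests. The Gaussian kernel, the fixed bandwidth, and the sample confidences below are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def gaussian_kde_pdf(samples, bandwidth):
    """Build a 1-D density estimate from samples with a Gaussian kernel."""
    samples = np.asarray(samples, dtype=float)
    norm = len(samples) * bandwidth * np.sqrt(2.0 * np.pi)
    def pdf(x):
        u = (x - samples) / bandwidth
        return np.exp(-0.5 * u * u).sum() / norm
    return pdf

# Illustrative confidences observed at one exit of a victim model:
# one cluster of uncertain inputs, one cluster of confident ones.
confidences = [0.55, 0.60, 0.62, 0.90, 0.92, 0.95]
pdf = gaussian_kde_pdf(confidences, bandwidth=0.05)
```

The estimated density is high near the observed clusters and low elsewhere, which is what lets an attacker infer where the victim's exit thresholds plausibly sit.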
Pan Li
SKLOIS, Institute of Information Engineering, Chinese Academy of Sciences, China; School of Cyber Security, University of Chinese Academy of Sciences, China
Peizhuo Lv
Research Fellow, Nanyang Technological University
AI Security
Kai Chen
SKLOIS, Institute of Information Engineering, Chinese Academy of Sciences, China; School of Cyber Security, University of Chinese Academy of Sciences, China
Yuling Cai
SKLOIS, Institute of Information Engineering, Chinese Academy of Sciences, China; School of Cyber Security, University of Chinese Academy of Sciences, China
Fan Xiang
SKLOIS, Institute of Information Engineering, Chinese Academy of Sciences, China; School of Cyber Security, University of Chinese Academy of Sciences, China
Shengzhi Zhang
Boston University MET College
AI Security · Vehicle Security · IoT Security · System Security