🤖 AI Summary
This work proposes a provable black-box adversarial attack that addresses the lack of theoretical guarantees in existing approaches, which often cannot ensure that an adversarial example is found within a bounded number of iterations. By distilling the black-box model on an expanding distillation dataset and precisely contracting the adversarial example search space, the proposed method, Contract And Conquer (CAC), provides the first theoretical convergence guarantee for black-box attacks: an effective adversarial example is provably found within a fixed number of iterations. Combining knowledge distillation, spatial constraints on the search space, and transferability analysis, the approach significantly outperforms state-of-the-art black-box attacks on ImageNet and applies broadly across model architectures, including CNNs and Vision Transformers.
📝 Abstract
Black-box adversarial attacks are widely used to test the robustness of deep neural networks against malicious perturbations of input data aimed at a specific change in the model's output. Although empirically effective, such methods usually do not guarantee that an adversarial example can be found for a particular model. In this paper, we propose Contract And Conquer (CAC), an approach that provably computes adversarial examples for neural networks in a black-box manner. The method is based on knowledge distillation of the black-box model on an expanding distillation dataset and precise contraction of the adversarial example search space. CAC is supported by a transferability guarantee: we prove that the method yields an adversarial example for the black-box model within a fixed number of algorithm iterations. Experimentally, we demonstrate that the proposed approach outperforms existing state-of-the-art black-box attack methods on the ImageNet dataset for different target models, including vision transformers.
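The abstract describes the loop only at a high level, so the following toy Python sketch is an assumption-laden illustration of the described idea, not the paper's actual algorithm: a surrogate is "distilled" from queries to the black box on a growing dataset, a candidate adversarial example is drawn from a bounded search region around the input, the candidate is checked for transfer to the black box, and on failure the queried point is added to the distillation set while the search region contracts. The linear models, the least-squares distillation step, and the 0.9 contraction factor are all hypothetical stand-ins chosen to keep the sketch self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "black-box" two-class linear classifier: we may only query its labels.
W_true = rng.normal(size=(2, 4))

def black_box(x):
    return int(np.argmax(W_true @ x))

def fit_surrogate(X, y):
    # Illustrative "distillation": least-squares fit of a linear surrogate
    # to one-hot labels collected from black-box queries.
    Y = np.eye(2)[y]
    W_ls, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W_ls.T  # shape (2, 4): one logit row per class

def cac_sketch(x0, eps, max_iter=20):
    label = black_box(x0)
    X, y = [x0], [label]      # distillation set grows each iteration
    radius = eps              # search-space radius contracts each iteration
    for _ in range(max_iter):
        W = fit_surrogate(np.array(X), np.array(y))
        # White-box step on the surrogate: move along the logit-difference
        # direction, staying inside the current search ball around x0.
        other = 1 - label
        g = W[other] - W[label]
        cand = x0 + radius * g / (np.linalg.norm(g) + 1e-12)
        if black_box(cand) != label:
            return cand       # candidate transfers to the black box: done
        # Failure: record the query and contract the search space
        # (hypothetical contraction schedule).
        X.append(cand)
        y.append(label)
        radius *= 0.9
    return None
```

In this toy setting, a returned candidate is guaranteed to be adversarial for the black box, since it is only returned after the transfer check succeeds; the paper's contribution is proving that such a candidate is always found within a fixed number of iterations, which this sketch does not establish.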