🤖 AI Summary
This work addresses black-box adversarial attacks without requiring access to classifier gradients. We propose a novel attack framework based on Consensus-Based Optimization (CBO), which models population-level consensus dynamics among query samples. By theoretically connecting natural evolutionary strategies with mean-field consensus jumps, we establish their equivalence to gradient-based optimization under specific conditions. Our method jointly optimizes non-convex loss functions and employs stochastic sampling, enabling efficient generation of highly imperceptible adversarial examples without knowledge of the target model's internal architecture. Experiments on ImageNet and other benchmarks demonstrate substantial improvements: attack success rates increase by 5.2–12.7%, while average query counts decrease by 38–61% compared to state-of-the-art evolutionary attacks.
📄 Abstract
Consensus-based optimization (CBO) has established itself as an efficient gradient-free optimization scheme with attractive mathematical properties, such as mean-field convergence results for non-convex loss functions. In this work, we study CBO in the context of closed-box adversarial attacks, i.e., imperceptible input perturbations that aim to fool a classifier without access to its gradients. Our contribution is twofold: we establish a connection between the so-called consensus hopping, as introduced by Riedl et al., and the natural evolution strategies (NES) commonly applied in the context of adversarial attacks, and we rigorously relate both methods to gradient-based optimization schemes. Beyond this theory, we provide a comprehensive experimental study showing that, despite the conceptual similarities, CBO can outperform NES and other evolutionary strategies in certain scenarios.
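To make the conceptual relationship concrete, the two gradient-free building blocks mentioned above can be sketched in a few lines. This is a minimal illustration on a toy quadratic objective, not the paper's attack pipeline: the `loss` function is a hypothetical stand-in for the closed-box query to a classifier, and all step sizes, the antithetic-sampling variant of NES, and the omission of the CBO diffusion (noise) term are simplifying assumptions of this sketch.

```python
import numpy as np

def loss(x):
    # Hypothetical stand-in for a closed-box attack objective; in a real
    # attack each evaluation would be one query to the target classifier.
    return np.sum((x - 3.0) ** 2)

def nes_gradient(x, sigma=0.1, n=50, rng=None):
    # NES-style gradient estimate from antithetic Gaussian perturbations:
    # only loss evaluations are used, never the true gradient.
    rng = np.random.default_rng(0) if rng is None else rng
    eps = rng.standard_normal((n, x.size))
    diffs = np.array([loss(x + sigma * e) - loss(x - sigma * e) for e in eps])
    return (diffs[:, None] * eps).mean(axis=0) / (2.0 * sigma)

def cbo_consensus(particles, alpha=10.0):
    # CBO consensus point: a softmin-weighted average of the particles
    # that concentrates on low-loss regions as alpha grows.
    vals = np.array([loss(p) for p in particles])
    w = np.exp(-alpha * (vals - vals.min()))  # shift for numerical stability
    w /= w.sum()
    return w @ particles

# Drift-only CBO iteration: particles contract toward the consensus point.
rng = np.random.default_rng(1)
particles = 3.0 + rng.standard_normal((20, 2)) * 4.0
for _ in range(100):
    c = cbo_consensus(particles)
    particles += -0.5 * (particles - c)  # diffusion/noise term omitted for brevity
```

Both routines query only loss values: NES averages perturbation directions weighted by loss differences to form a gradient surrogate, while CBO replaces the gradient with an attraction toward a loss-weighted consensus point, which is exactly the structural similarity the abstract refers to.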