🤖 AI Summary
Existing adversarial attack methods focus mostly on single-label settings, neglect the perturbation cost in the input space, and optimize inefficiently. This paper proposes the Jacobian-Mahalanobis Attack (JMA), a targeted attack framework that combines Wolfe duality with non-negative least squares (NNLS). JMA solves a linearized version of the adversarial example problem by minimizing a Jacobian-induced Mahalanobis distance, providing unified support for single-label, multi-label, and arbitrary output-encoding schemes. Theoretically sound and computationally efficient, JMA requires significantly fewer iterations than state-of-the-art methods. On multi-label tasks it can flip up to roughly half of the labels simultaneously, a capability beyond existing one-hot-oriented attacks, and it achieves optimal or near-optimal attack performance across diverse output encodings.
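To make the "Jacobian-induced Mahalanobis distance" concrete: in the linearized model $f(x+d) \approx f(x) + Jd$, the minimum-norm input perturbation reaching a desired output change has a squared norm equal to a Mahalanobis distance induced by $JJ^T$. The following toy sketch (a random stand-in Jacobian, not the paper's actual network or solver; all names and dimensions are illustrative) checks this identity numerically:

```python
# Toy illustration: in the linearization f(x + d) ~= f(x) + J d, the
# minimum-norm perturbation d reaching an output change dz is the
# least-squares solution d = J^T (J J^T)^{-1} dz, and its squared norm
# is the Jacobian-induced Mahalanobis distance dz^T (J J^T)^{-1} dz.
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 20                      # 5 output scores, 20 input features
J = rng.standard_normal((m, n))   # stand-in for the network Jacobian at x
dz = rng.standard_normal(m)       # desired output change (target - current)

# Minimum-norm perturbation (pinv equals J^T (J J^T)^{-1} for full row rank J)
d = np.linalg.pinv(J) @ dz

# The linearized output reaches the target change exactly
assert np.allclose(J @ d, dz)

# The perturbation cost equals the Mahalanobis distance induced by J J^T
mahalanobis_sq = dz @ np.linalg.solve(J @ J.T, dz)
assert np.isclose(d @ d, mahalanobis_sq)
```

This is why minimizing that distance amounts to choosing the output direction that is cheapest to reach in the input space.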
📝 Abstract
Most of the approaches proposed so far to craft targeted adversarial examples against deep learning classifiers are highly suboptimal and typically rely on increasing the likelihood of the target class, thus implicitly focusing on one-hot encoding settings. In this paper, we propose a more general, theoretically sound, targeted attack that minimizes a Jacobian-induced MAhalanobis distance (JMA) term, taking into account the effort (in the input space) required to move the latent-space representation of the input sample in a given direction. The minimization is solved by exploiting the Wolfe duality theorem, reducing the problem to the solution of a Non-Negative Least Squares (NNLS) problem. The proposed algorithm provides an optimal solution to a linearized version of the adversarial example problem originally introduced by Szegedy et al. (2013). Our experiments confirm the generality of the proposed attack, which proves effective under a wide variety of output encoding schemes. Notably, the JMA attack is also effective in multi-label classification, being capable of inducing a targeted modification of up to half of the labels in a complex scenario with 20 labels, a capability out of reach of all the attacks proposed so far. As a further advantage, the JMA attack usually requires very few iterations, making it more efficient than existing methods.
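The Wolfe-duality-to-NNLS reduction mentioned above can be sketched for the generic linearized problem min ||d||² subject to Jd ≥ b (with b encoding the required output-score changes): the Wolfe dual is a non-negative least squares problem in the multipliers λ ≥ 0, and the perturbation is recovered as d = Jᵀλ. The sketch below illustrates this structure under those assumptions; it is not the paper's exact formulation, and all names and dimensions are illustrative:

```python
# Hedged sketch: reducing min ||d||^2 s.t. J d >= b to an NNLS problem
# via the Wolfe dual. The dual objective is lam^T (J J^T) lam - 2 b^T lam
# over lam >= 0; with J J^T = R^T R (Cholesky), this is exactly the NNLS
# problem min ||R lam - R^{-T} b|| with lam >= 0.
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(1)
m, n = 5, 20
J = rng.standard_normal((m, n))    # stand-in for the network Jacobian
b = rng.standard_normal(m)         # required changes in the output scores

R = np.linalg.cholesky(J @ J.T).T  # upper-triangular factor, R^T R = J J^T
c = np.linalg.solve(R.T, b)        # c = R^{-T} b
lam, _ = nnls(R, c)                # non-negative multipliers

d = J.T @ lam                      # recover the input perturbation
assert np.all(J @ d >= b - 1e-8)   # linearized targets are met
```

Solving one NNLS problem per iteration, rather than running a long gradient-based search, is what makes the few-iterations claim plausible.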