🤖 AI Summary
Black-box adversarial attacks are rarely evaluated comprehensively in realistic settings, where the attacker has query access only and an adversarial example must simultaneously survive JPEG compression, evade automated detectors, and remain imperceptible to humans. To address this gap, this paper proposes the first unified quantitative framework for assessing “triple stealthiness.” It introduces ECLIPSE, a novel method that integrates Gaussian-blur-based gradient estimation, local surrogate modeling, and multi-objective black-box optimization to jointly optimize the three stealth dimensions. On standard image classification benchmarks, ECLIPSE substantially outperforms state-of-the-art methods: it maintains high attack success rates while reducing post-JPEG accuracy degradation by 37%, lowering detection rates by mainstream defenses by 52%, and driving human subject identification rates down to near-chance level (≈51%). This work delivers the first empirically validated, end-to-end optimization of triple stealthiness under realistic constraints.
📝 Abstract
Deep learning systems, which are critical in domains such as autonomous vehicles, are vulnerable to adversarial examples (crafted inputs designed to mislead classifiers). This study investigates black-box adversarial attacks in computer vision, a realistic scenario in which attackers have query-only access to the target model. Three properties are introduced to evaluate attack feasibility: robustness to compression, stealthiness against automatic detection, and stealthiness against human inspection. State-of-the-art methods tend to prioritize one criterion at the expense of the others. We propose ECLIPSE, a novel attack method that applies Gaussian blurring to sampled gradients and uses a local surrogate model. Comprehensive experiments on a public dataset highlight ECLIPSE's advantages, demonstrating its contribution to the trade-off among the three properties.
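To make the phrase "Gaussian blurring on sampled gradients" more concrete, the following is a minimal sketch of one way such a step could look in a query-only setting. It is not the ECLIPSE algorithm, which the abstract does not specify in detail. The sketch assumes a generic antithetic random-direction gradient estimator followed by a separable Gaussian blur; the names `query_loss`, `estimate_gradient`, `gaussian_blur`, and all parameter values are hypothetical, and the local surrogate model is omitted.

```python
import numpy as np


def gaussian_kernel_1d(sigma, radius):
    """Normalized 1-D Gaussian kernel with 2 * radius + 1 taps."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()


def gaussian_blur(img, sigma=1.0):
    """Separable Gaussian blur of a 2-D array, using reflect padding."""
    radius = max(1, int(3 * sigma))
    k = gaussian_kernel_1d(sigma, radius)
    padded = np.pad(img, radius, mode="reflect")
    # Blur along rows first, then along columns.
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)


def estimate_gradient(query_loss, x, n_samples=50, delta=0.01, sigma=1.0, rng=None):
    """Estimate d(query_loss)/dx from queries only, then smooth the estimate.

    query_loss is a hypothetical black-box function mapping a 2-D image to a
    scalar; only its outputs are used, never its internals.
    """
    rng = np.random.default_rng() if rng is None else rng
    grad = np.zeros_like(x, dtype=float)
    for _ in range(n_samples):
        u = rng.standard_normal(x.shape)  # random probing direction
        # Antithetic finite difference along u (two queries per sample).
        diff = query_loss(x + delta * u) - query_loss(x - delta * u)
        grad += diff / (2.0 * delta) * u
    grad /= n_samples
    # Assumption: blurring suppresses high-frequency noise in the estimate.
    return gaussian_blur(grad, sigma)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = np.ones((8, 8))  # toy "model": a linear loss with known gradient
    toy_loss = lambda img: float(np.sum(weights * img))
    x0 = rng.random((8, 8))
    g = estimate_gradient(toy_loss, x0, n_samples=200, rng=rng)
    print(g.shape, float(g.mean()))
```

In this toy setup the true gradient is the constant array `weights`, so the smoothed estimate should hover around 1 on average. A plausible reason to blur, under the assumptions above, is that low-frequency perturbations tend to survive lossy compression better than pixel-level noise, which relates to the first of the three properties; whether ECLIPSE uses the blur for this purpose is not stated in the abstract.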