Adversarial Robustness Through Artifact Design

📅 2024-02-07
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Traffic sign recognition (TSR) models are highly vulnerable to physical adversarial patch attacks, while existing defense methods struggle to simultaneously ensure robustness and human readability. Method: This paper proposes, for the first time, a robust optimization framework operating at the artifact level (i.e., integrated into the traffic-sign design standardization process), leveraging official physical design specifications (patterns and color schemes). The framework jointly optimizes sign designs via robust optimization modeling, gradient descent, greedy search, and adversarial training to strengthen model resilience against physical adversarial attacks. Contribution/Results: On TSR benchmarks, the approach improves adversarial robust accuracy by up to 25.18% over state-of-the-art defenses while also boosting clean-sample accuracy. A user study confirms that the optimized signs remain fully human-interpretable while significantly improving machine robustness under physical adversarial conditions.

๐Ÿ“ Abstract
Adversarial examples arose as a challenge for machine learning. To hinder them, most defenses alter how models are trained (e.g., adversarial training) or inference is made (e.g., randomized smoothing). Still, while these approaches markedly improve models' adversarial robustness, models remain highly susceptible to adversarial examples. Identifying that, in certain domains such as traffic-sign recognition, objects are implemented per standards specifying how artifacts (e.g., signs) should be designed, we propose a novel approach for improving adversarial robustness. Specifically, we offer a method to redefine standards, making minor changes to existing ones, to defend against adversarial examples. We formulate the problem of artifact design as a robust optimization problem, and propose gradient-based and greedy search methods to solve it. We evaluated our approach in the domain of traffic-sign recognition, allowing it to alter traffic-sign pictograms (i.e., symbols within the signs) and their colors. We found that, combined with adversarial training, our approach led to up to 25.18% higher robust accuracy compared to state-of-the-art methods against two adversary types, while further increasing accuracy on benign inputs. Notably, a user study we conducted showed that traffic signs produced by our approach are also easily recognizable by human subjects.
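The abstract's core formulation — artifact design as a robust optimization problem, solved with gradient-based methods — can be illustrated on a toy one-dimensional min-max objective. The function names and the toy loss below are illustrative assumptions, not the paper's actual implementation: the outer loop plays the role of the sign designer, the inner loop the role of the patch attacker.

```python
import numpy as np

def robust_optimize(grad_theta, worst_delta, theta0, steps=200, lr=0.1):
    """Minimal gradient-descent-ascent sketch of robust optimization:
    the outer loop updates the design parameter `theta` against the
    inner loop's worst-case perturbation `delta` (the "attack")."""
    theta = theta0
    for _ in range(steps):
        delta = worst_delta(theta)                     # inner maximization
        theta = theta - lr * grad_theta(theta, delta)  # outer descent step
    return theta

# Toy objective: loss(theta, delta) = (theta + delta)^2 with |delta| <= eps.
# The robust optimum is theta = 0, where the worst-case loss is eps^2.
eps = 0.5
worst_delta = lambda theta: eps * np.sign(theta)       # boundary maximizer
grad_theta = lambda theta, delta: 2.0 * (theta + delta)

theta_star = robust_optimize(grad_theta, worst_delta, theta0=2.0)
# theta_star settles into a small oscillation around the robust optimum 0
```

In the paper's setting, `theta` would instead parameterize sign features (pictograms, colors) under standard-compliance constraints, and the inner maximization would be a physical patch attack rather than a closed-form perturbation.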
Problem

Research questions and friction points this paper is trying to address.

Redesigning traffic signs to resist adversarial patch attacks
Enhancing traffic-sign recognition model robustness against physical attacks
Maintaining human interpretability while improving machine learning security
Innovation

Methods, ideas, or system contributions that make the work stand out.

Redesigning traffic signs for adversarial robustness
Optimizing sign features within human-interpretable constraints
Synthesizing realistic sign images to train robust models
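The greedy search component mentioned alongside the gradient-based method can be sketched as a coordinate-wise pass over a factored design space, e.g., one candidate list of colors per sign element. Here `robust_loss` is a stand-in for an inner-loop adversarial evaluation of a full design; all names and the toy loss are illustrative, not the paper's API.

```python
def greedy_design_search(slots, robust_loss):
    """Greedily choose one option per design slot (e.g., the color of
    each sign element) to reduce the estimated worst-case loss.
    Each slot is fixed to its best option given the choices so far."""
    design = [options[0] for options in slots]  # start from a default design
    for i, options in enumerate(slots):
        design[i] = min(
            options,
            key=lambda opt: robust_loss(design[:i] + [opt] + design[i + 1:]),
        )
    return design

# Toy stand-in: loss counts how far the design is from a known-robust one.
target = ["red", "white", "black"]
toy_loss = lambda design: sum(a != b for a, b in zip(design, target))
slots = [["blue", "red"], ["white", "yellow"], ["green", "black"]]
print(greedy_design_search(slots, toy_loss))  # ['red', 'white', 'black']
```

Greedy search fits the discrete design choices (which pictogram variant, which standard color) where gradients are unavailable, while the gradient-based method handles continuous parameters; the paper combines both with adversarial training.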