Adversarial Robustness Through Artifact Design

📅 2024-02-07
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Traffic sign recognition (TSR) models are highly vulnerable to physical adversarial patch attacks, while existing defense methods struggle to simultaneously ensure robustness and human readability. Method: This paper proposes, for the first time, a robust optimization framework operating at the artifact level (i.e., integrated into the traffic-sign design standardization process), leveraging official physical design specifications (patterns and color schemes). The framework jointly optimizes sign designs via robust optimization modeling, gradient descent, greedy search, and adversarial training to strengthen model resilience against physical adversarial attacks. Contribution/Results: On TSR benchmarks, the approach improves adversarial robust accuracy by up to 25.18% over state-of-the-art defenses while also boosting clean-sample accuracy. A user study confirms that the optimized signs remain fully human-interpretable while significantly improving machine robustness under physical adversarial conditions.

๐Ÿ“ Abstract
Adversarial examples arose as a challenge for machine learning. To hinder them, most defenses alter how models are trained (e.g., adversarial training) or inference is made (e.g., randomized smoothing). Still, while these approaches markedly improve models' adversarial robustness, models remain highly susceptible to adversarial examples. Identifying that, in certain domains such as traffic-sign recognition, objects are implemented per standards specifying how artifacts (e.g., signs) should be designed, we propose a novel approach for improving adversarial robustness. Specifically, we offer a method to redefine standards, making minor changes to existing ones, to defend against adversarial examples. We formulate the problem of artifact design as a robust optimization problem, and propose gradient-based and greedy search methods to solve it. We evaluated our approach in the domain of traffic-sign recognition, allowing it to alter traffic-sign pictograms (i.e., symbols within the signs) and their colors. We found that, combined with adversarial training, our approach led to up to 25.18% higher robust accuracy compared to state-of-the-art methods against two adversary types, while further increasing accuracy on benign inputs. Notably, a user study we conducted showed that traffic signs produced by our approach are also easily recognizable by human subjects.
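The abstract's core formulation — artifact design as a robust optimization problem, solved with gradient-based methods — can be illustrated on a toy one-dimensional min-max objective. The function names and the toy loss below are illustrative assumptions, not the paper's actual implementation: the outer loop plays the role of the sign designer, the inner loop the role of the patch attacker.

```python
import numpy as np

def robust_optimize(grad_theta, worst_delta, theta0, steps=200, lr=0.1):
    """Minimal gradient-descent-ascent sketch of robust optimization:
    the outer loop updates the design parameter `theta` against the
    inner loop's worst-case perturbation `delta` (the "attack")."""
    theta = theta0
    for _ in range(steps):
        delta = worst_delta(theta)                     # inner maximization
        theta = theta - lr * grad_theta(theta, delta)  # outer descent step
    return theta

# Toy objective: loss(theta, delta) = (theta + delta)^2 with |delta| <= eps.
# The robust optimum is theta = 0, where the worst-case loss is eps^2.
eps = 0.5
worst_delta = lambda theta: eps * np.sign(theta)       # boundary maximizer
grad_theta = lambda theta, delta: 2.0 * (theta + delta)

theta_star = robust_optimize(grad_theta, worst_delta, theta0=2.0)
# theta_star settles into a small oscillation around the robust optimum 0
```

In the paper's setting, `theta` would instead parameterize sign features (pictograms, colors) under standard-compliance constraints, and the inner maximization would be a physical patch attack rather than a closed-form perturbation.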
Problem

Research questions and friction points this paper is trying to address.

Redesigning traffic signs to resist adversarial patch attacks
Enhancing traffic-sign recognition model robustness against physical attacks
Maintaining human interpretability while improving machine learning security
Innovation

Methods, ideas, or system contributions that make the work stand out.

Redesigning traffic signs for adversarial robustness
Optimizing sign features within human-interpretable constraints
Synthesizing realistic sign images to train robust models
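The greedy search component mentioned alongside the gradient-based method can be sketched as a coordinate-wise pass over a factored design space, e.g., one candidate list of colors per sign element. Here `robust_loss` is a stand-in for an inner-loop adversarial evaluation of a full design; all names and the toy loss are illustrative, not the paper's API.

```python
def greedy_design_search(slots, robust_loss):
    """Greedily choose one option per design slot (e.g., the color of
    each sign element) to reduce the estimated worst-case loss.
    Each slot is fixed to its best option given the choices so far."""
    design = [options[0] for options in slots]  # start from a default design
    for i, options in enumerate(slots):
        design[i] = min(
            options,
            key=lambda opt: robust_loss(design[:i] + [opt] + design[i + 1:]),
        )
    return design

# Toy stand-in: loss counts how far the design is from a known-robust one.
target = ["red", "white", "black"]
toy_loss = lambda design: sum(a != b for a, b in zip(design, target))
slots = [["blue", "red"], ["white", "yellow"], ["green", "black"]]
print(greedy_design_search(slots, toy_loss))  # ['red', 'white', 'black']
```

Greedy search fits the discrete design choices (which pictogram variant, which standard color) where gradients are unavailable, while the gradient-based method handles continuous parameters; the paper combines both with adversarial training.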