Almost for Free: Crafting Adversarial Examples with Convolutional Image Filters

📅 2026-05-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes a lightweight adversarial attack inspired by insights from explainable machine learning, avoiding the computational cost of traditional approaches that rely on gradients or extensive model queries. Drawing on classic edge detection algorithms, the authors design simple 3×3 convolutional image filters that generate untargeted adversarial examples in a single pass over the input, without gradient computation at attack time and without generative models. Compared to related approaches based on generative models, the filters reduce the parameter count by five orders of magnitude. Empirically, they reach attack success rates of 30%–80% on different neural networks, transfer well between models, and share structure with classic image filters, offering further insight into why neural networks are vulnerable.

📝 Abstract

Adversarial examples in machine learning are typically generated using gradients, obtained either directly through access to the model or approximated via queries to it. In this paper, we propose a much simpler approach to craft adversarial examples, drawing inspiration from insights of explainable machine learning. In particular, we design adversarial image filters that are based on classic edge detection algorithms but optimized to deceive learning models. The resulting untargeted attacks are transferable and require only a single pass over the input. Empirically, we find that 3×3 filters already enable success rates between 30% and 80% on different neural networks. Compared to related approaches using generative models for crafting adversarial examples, we reduce the number of parameters by five orders of magnitude, resulting in a very efficient attack. When investigating the parameters of the learned filters, we observe interesting properties such as a high transferability between models and structures common to classic image filters. Our results provide further insights into the vulnerability of neural networks and their fragility to malicious noise.
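
To make the "single pass over the input" concrete, the following is a minimal sketch of applying one 3×3 filter to an image. It is not the authors' implementation: the abstract does not state how the filters are parameterized, optimized, or applied per channel, so the function name `apply_filter`, the edge padding, the clipping to [0, 1], and the hand-written kernel (identity minus a scaled Laplacian) are all assumptions chosen for illustration. A learned adversarial filter would replace the nine kernel values.

```python
import numpy as np

def apply_filter(image, kernel):
    """Apply one 3x3 kernel to every channel of an HxWxC image in [0, 1].

    Uses cross-correlation (no kernel flip), as deep learning frameworks do.
    Edge padding and clipping are illustrative assumptions.
    """
    h, w, _ = image.shape
    padded = np.pad(image, ((1, 1), (1, 1), (0, 0)), mode="edge")
    out = np.zeros_like(image, dtype=np.float64)
    # Accumulate the nine shifted, weighted copies of the image.
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w, :]
    return np.clip(out, 0.0, 1.0)

# Hypothetical kernel, NOT a learned filter: identity minus a scaled Laplacian,
# a classic edge-enhancing operator. The strength 0.1 is arbitrary.
identity = np.zeros((3, 3))
identity[1, 1] = 1.0
laplacian = np.array([[0.0, 1.0, 0.0],
                      [1.0, -4.0, 1.0],
                      [0.0, 1.0, 0.0]])
kernel = identity - 0.1 * laplacian

# Random stand-in for a real input image.
rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))
filtered = apply_filter(image, kernel)
```

The whole attack state in this sketch is the nine entries of `kernel`, which illustrates why such a filter is so much smaller than a generative model. Whether `filtered` actually changes a classifier's prediction depends on the kernel values and the target model, neither of which this sketch provides.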

Problem

Research questions and friction points this paper is trying to address.

adversarial examples

convolutional filters

model vulnerability

transferable attacks

efficient attack

Innovation

Methods, ideas, or system contributions that make the work stand out.

adversarial image filters

convolutional filters

transferable attacks