Evil from Within: Machine Learning Backdoors through Hardware Trojans

📅 2023-04-17

🏛️ arXiv.org

📈 Citations: 2

✨ Influential: 1

🤖 AI Summary

This work demonstrates a backdoor attack that resides entirely within a machine-learning hardware accelerator. The authors embed a configurable hardware trojan into the Xilinx Vitis AI DPU, a commercial accelerator; the trojan replaces model parameters only when the specific target model is processed, so neither the learning model nor the software outside the accelerator is modified. Because memory on the accelerator is severely limited, the attack relies on a “minimal backdoor” that deviates as little as possible from the original model: for a traffic-sign recognition system, it replaces only 30 parameters (0.069% of the model) yet reliably manipulates recognition once the input contains the backdoor trigger. The trojan enlarges the accelerator's hardware circuit by 0.24% and adds no run-time overhead, which makes detection hardly possible. Current defenses assume that the inference hardware is trusted, so they fail against this attack. The authors conclude that hardware should be manufactured only in fully trusted environments, a concern for security-critical systems such as self-driving cars.

📝 Abstract

Backdoors pose a serious threat to machine learning, as they can compromise the integrity of security-critical systems, such as self-driving cars. While different defenses have been proposed to address this threat, they all rely on the assumption that the hardware on which the learning models are executed during inference is trusted. In this paper, we challenge this assumption and introduce a backdoor attack that completely resides within a common hardware accelerator for machine learning. Outside of the accelerator, neither the learning model nor the software is manipulated, so that current defenses fail. To make this attack practical, we overcome two challenges: First, as memory on a hardware accelerator is severely limited, we introduce the concept of a minimal backdoor that deviates as little as possible from the original model and is activated by replacing a few model parameters only. Second, we develop a configurable hardware trojan that can be provisioned with the backdoor and performs a replacement only when the specific target model is processed. We demonstrate the practical feasibility of our attack by implanting our hardware trojan into the Xilinx Vitis AI DPU, a commercial machine-learning accelerator. We configure the trojan with a minimal backdoor for a traffic-sign recognition system. The backdoor replaces only 30 (0.069%) model parameters, yet it reliably manipulates the recognition once the input contains a backdoor trigger. Our attack expands the hardware circuit of the accelerator by 0.24% and induces no run-time overhead, rendering a detection hardly possible. Given the complex and highly distributed manufacturing process of current hardware, our work points to a new threat in machine learning that is inaccessible to current security mechanisms and calls for hardware to be manufactured only in fully trusted environments.
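The replacement mechanism described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' hardware design: the class `ToyTrojan`, the fingerprint check over a few probed parameter positions, and all parameter values are hypothetical and chosen for illustration. It shows the two properties the abstract names: parameters are swapped only when the target model is loaded, and only a small fraction of them changes.

```python
from typing import Dict, List, Sequence


class ToyTrojan:
    """Toy software model of a configurable, parameter-replacing trojan.

    Hypothetical illustration only; the real attack lives in hardware.
    """

    def __init__(self, probe_idx: Sequence[int], target_fp: Sequence[float],
                 replacements: Dict[int, float]) -> None:
        self.probe_idx = list(probe_idx)        # positions used to recognize the target model (assumed scheme)
        self.target_fp = tuple(target_fp)       # expected values at those positions
        self.replacements = dict(replacements)  # the "minimal backdoor": index -> new value

    def matches_target(self, params: Sequence[float]) -> bool:
        # Compare the probed parameters against the configured fingerprint.
        return tuple(params[i] for i in self.probe_idx) == self.target_fp

    def load(self, params: Sequence[float]) -> List[float]:
        # Model the accelerator loading parameters; swap values only for the target model.
        out = list(params)
        if self.matches_target(params):
            for idx, value in self.replacements.items():
                out[idx] = value
        return out


# Made-up parameter vectors: the second differs from the target at a probed position.
target_model = [0.5, -1.0, 0.25, 2.0, 0.0, 1.5]
other_model = [0.5, -1.0, 0.75, 2.0, 0.0, 1.5]

trojan = ToyTrojan(probe_idx=[0, 2], target_fp=[0.5, 0.25],
                   replacements={3: -2.0, 5: 9.0})

backdoored = trojan.load(target_model)  # target model: two parameters are replaced
untouched = trojan.load(other_model)    # any other model passes through unchanged
changed = sum(a != b for a, b in zip(target_model, backdoored))
print(f"replaced {changed} of {len(target_model)} parameters")
```

In this toy example the stored model is never altered; only the copy seen by the accelerator differs, which mirrors why defenses that inspect the model or the software outside the accelerator would not observe the change.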

Problem

Research questions and friction points this paper is trying to address.

Backdoor Attacks

Machine Learning Hardware

Security Threats

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hardware-Targeted Backdoor Attack

Minimal Backdoor Design

Configurable Hardware Trojan
