Diffusion-Guided Backdoor Attacks in Real-World Reinforcement Learning

📅 2026-01-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing backdoor attacks often fail on real-world robotic systems because safety control mechanisms such as velocity limits and action smoothing suppress the triggered behavior. To address this, the work proposes the Diffusion-Guided Backdoor Attack (DGBA), the first backdoor framework designed for realistic reinforcement learning deployments that can circumvent such safety constraints. DGBA leverages a conditional diffusion model to generate printable patch triggers that remain robust under real-world visual variation, and employs advantage-based poisoning to precisely target decision-critical states, enabling stealthy and reliable attacks even against black-box control stacks. Experiments on the TurtleBot3 platform demonstrate that DGBA reliably induces targeted malicious behaviors while preserving normal task performance, confirming its efficacy and robustness on physical robotic systems.

📝 Abstract
Backdoor attacks embed hidden malicious behaviors in reinforcement learning (RL) policies and activate them using triggers at test time. Most existing attacks are validated only in simulation, while their effectiveness in real-world robotic systems remains unclear. In physical deployment, safety-constrained control pipelines such as velocity limiting, action smoothing, and collision avoidance suppress abnormal actions, causing strong attenuation of conventional backdoor attacks. We study this previously overlooked problem and propose a diffusion-guided backdoor attack framework (DGBA) for real-world RL. We design small printable visual patch triggers placed on the floor and generate them using a conditional diffusion model that produces diverse patch appearances under real-world visual variations. We treat the robot control stack as a black-box system. We further introduce an advantage-based poisoning strategy that injects triggers only at decision-critical training states. We evaluate our method on a TurtleBot3 mobile robot and demonstrate reliable activation of targeted attacks while preserving normal task performance. Demo videos and code are available in the supplementary material.
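The advantage-based poisoning strategy described in the abstract can be sketched in a few lines: estimate per-step advantages for an episode, then inject the trigger only at the states with the largest advantage magnitude. This is an illustrative reconstruction, not the paper's implementation; the use of GAE as the advantage estimator, the `budget` parameter, and all function names are assumptions.

```python
def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation over one episode.
    `values` has len(rewards) + 1 entries (bootstrap value at the end)."""
    advantages = [0.0] * len(rewards)
    gae = 0.0
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        gae = delta + gamma * lam * gae
        advantages[t] = gae
    return advantages

def select_poison_indices(advantages, budget=0.1):
    """Pick the fraction `budget` of timesteps with the largest |advantage|,
    i.e. the decision-critical states, as trigger-injection targets."""
    k = max(1, int(budget * len(advantages)))
    ranked = sorted(range(len(advantages)),
                    key=lambda t: abs(advantages[t]), reverse=True)
    return sorted(ranked[:k])

# Toy episode: a reward spike at t=5 makes the states just before it critical.
rewards = [0.0] * 10
rewards[5] = 10.0
values = [0.0] * 11  # flat value estimates, for illustration only
adv = gae_advantages(rewards, values)
print(select_poison_indices(adv, budget=0.2))  # → [4, 5]
```

Poisoning only high-|advantage| states keeps the poisoning rate (and thus the clean-task performance impact) low while still associating the trigger with the actions that most affect the return.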
Problem

Research questions and friction points this paper is trying to address.

backdoor attacks
real-world reinforcement learning
safety-constrained control
robotic systems
trigger attenuation
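The trigger-attenuation problem above can be illustrated with a minimal sketch of a safety-constrained velocity pipeline. The 0.22 m/s limit matches the TurtleBot3 Burger's rated maximum linear speed; the smoothing coefficient and the function itself are illustrative assumptions, not the paper's control stack.

```python
def safe_filter(v_cmd, v_prev, v_max=0.22, alpha=0.7):
    """Safety-constrained control step: clamp the commanded linear velocity
    to the platform limit, then exponentially smooth it so abrupt jumps
    (including backdoor-triggered ones) are strongly attenuated."""
    v_clamped = max(-v_max, min(v_max, v_cmd))
    return alpha * v_prev + (1.0 - alpha) * v_clamped

# A backdoor-triggered command of 1.0 m/s from rest is reduced to ~0.066 m/s
# after one step: clamped to 0.22, then smoothed toward the previous 0.0.
attenuated = safe_filter(1.0, 0.0)
print(round(attenuated, 3))  # → 0.066
```

This is why conventional attacks that rely on a single large abnormal action fail in deployment: the filter never lets that action reach the motors, which motivates attacks that act through decision-critical states rather than raw action magnitude.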
Innovation

Methods, ideas, or system contributions that make the work stand out.

diffusion-guided backdoor attack
real-world reinforcement learning
visual patch trigger
advantage-based poisoning
black-box robot control