Poisoning the Pixels: Revisiting Backdoor Attacks on Semantic Segmentation

📅 2026-03-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the vulnerability of semantic segmentation models to backdoor attacks in safety-critical scenarios, noting that existing research largely adopts threat models from image classification and overlooks attack surfaces unique to segmentation tasks. The study systematically defines a multidimensional backdoor threat space tailored to semantic segmentation, introducing four coarse-grained and two fine-grained attack vectors. It further proposes BADSEG, a unified framework that achieves high attack efficacy through optimized trigger design and label manipulation strategies. Extensive experiments demonstrate that BADSEG attains high attack success rates across diverse mainstream architectures—including Transformers and the Segment Anything Model—and benchmark datasets, while minimally degrading performance on clean samples. Moreover, six representative defense methods fail against BADSEG, exposing critical gaps in current defenses for segmentation tasks.

📝 Abstract
Semantic segmentation models are widely deployed in safety-critical applications such as autonomous driving, yet their vulnerability to backdoor attacks remains largely underexplored. Prior segmentation backdoor studies transfer threat settings from image classification, focusing primarily on object-to-background mis-segmentation. In this work, we revisit these threats by systematically examining backdoor attacks tailored to semantic segmentation. We identify four coarse-grained attack vectors (Object-to-Object, Object-to-Background, Background-to-Object, and Background-to-Background attacks), as well as two fine-grained vectors (Instance-Level and Conditional attacks). To formalize these attacks, we introduce BADSEG, a unified framework that optimizes trigger designs and applies label manipulation strategies to maximize attack performance while preserving victim model utility. Extensive experiments across diverse segmentation architectures on benchmark datasets demonstrate that BADSEG achieves high attack effectiveness with minimal impact on clean samples. We further evaluate six representative defenses and find that they fail to reliably mitigate our attacks, revealing critical gaps in current defenses. Finally, we demonstrate that these vulnerabilities persist in recent emerging architectures, including transformer-based networks and the Segment Anything Model (SAM), thereby compromising their security. Our work reveals previously overlooked security vulnerabilities in semantic segmentation, and motivates the development of defenses tailored to segmentation-specific threat models.
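To make the attack vectors concrete, the sketch below shows the data-poisoning step underlying such attacks: stamp a small trigger patch into an image and relabel one class in its segmentation mask (an Object-to-Object style flip). This is a minimal illustration in NumPy under assumed array shapes, not the paper's actual BADSEG trigger optimization; all function and parameter names here are hypothetical.

```python
import numpy as np

def poison_sample(image, mask, trigger, src_class, dst_class, pos=(0, 0)):
    """Illustrative segmentation-backdoor poisoning (hypothetical API,
    not BADSEG itself): paste a trigger patch into the image and flip
    the labels of src_class pixels to dst_class in the mask."""
    img = image.copy()
    msk = mask.copy()
    th, tw = trigger.shape[:2]
    y, x = pos
    img[y:y + th, x:x + tw] = trigger   # stamp the visible trigger patch
    msk[msk == src_class] = dst_class   # Object-to-Object label flip
    return img, msk

# Toy example: an 8x8 "image" whose mask is entirely class 1,
# poisoned with a 2x2 white trigger so class 1 becomes class 0.
image = np.zeros((8, 8, 3), dtype=np.uint8)
mask = np.full((8, 8), 1, dtype=np.uint8)
trigger = np.full((2, 2, 3), 255, dtype=np.uint8)
p_img, p_mask = poison_sample(image, mask, trigger, src_class=1, dst_class=0)
```

Injecting a small fraction of samples poisoned this way into the training set is what teaches the victim model the trigger-to-relabel association while leaving clean-sample behavior largely intact.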
Problem

Research questions and friction points this paper is trying to address.

backdoor attacks
semantic segmentation
security vulnerabilities
threat models
model poisoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

backdoor attack
semantic segmentation
BADSEG
trigger design
label manipulation