DeltaSeg: Tiered Attention and Deep Delta Learning for Multi-Class Structural Defect Segmentation

📅 2026-04-20

🤖 AI Summary
This work addresses three challenges in structural defect segmentation (diverse damage types, extreme class imbalance, and ambiguous boundary localization) by proposing DeltaSeg, a U-shaped encoder-decoder network with a tiered attention strategy. Squeeze-and-Excitation (SE) channel attention is applied in the encoder, Coordinate Attention at the bottleneck and decoder, and a novel Deep Delta Attention (DDA) module in the skip connections. DDA uses a dual-path scheme: a learned delta operator suppresses nuisance features, while spatial attention gates conditioned on decoder signals sharpen spatial focus. The encoder uses depthwise separable convolutions with dilated stages, ASPP at the bottleneck captures multi-scale context, and multi-scale auxiliary heads provide deep supervision. Evaluated on the S2DS (7-class) and CSDD (9-class) datasets, DeltaSeg consistently outperforms 12 competing architectures, indicating strong generalization across damage types, imaging conditions, and structural geometries.

📝 Abstract
Automated segmentation of structural defects from visual inspection imagery remains challenging due to the diversity of damage types, extreme class imbalance, and the need for precise boundary delineation. This paper presents DeltaSeg, a U-shaped encoder-decoder architecture with a tiered attention strategy that integrates Squeeze-and-Excitation (SE) channel attention in the encoder, Coordinate Attention at the bottleneck and decoder, and a novel Deep Delta Attention (DDA) mechanism in the skip connections. The encoder uses depthwise separable convolutions with dilated stages to maintain spatial resolution while expanding the receptive field. Atrous Spatial Pyramid Pooling (ASPP) at the bottleneck captures multi-scale context. The DDA module refines skip connections through a dual-path scheme combining a learned delta operator for nuisance feature suppression with spatial attention gates conditioned on decoder signals. Deep supervision through multi-scale auxiliary heads further strengthens gradient flow and encourages semantically meaningful features at intermediate decoder stages. We evaluate DeltaSeg on two datasets: the S2DS dataset (7 classes) and the Culvert-Sewer Defect Dataset (CSDD, 9 classes). Across both benchmarks, DeltaSeg consistently outperforms 12 competing architectures including U-Net, SA-UNet, UNet3+, SegFormer, Swin-UNet, EGE-UNet, FPN, and Mobile-UNETR, demonstrating strong generalization across damage types, imaging conditions, and structural geometries.
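
The abstract's claim that depthwise separable convolutions with dilated stages expand the receptive field while staying lightweight can be checked with simple arithmetic. The sketch below is illustrative only: the channel counts (64, 128), the 3x3 kernels, and the dilation rates (1, 2, 4) are assumptions chosen for the example, not values reported for DeltaSeg.

```python
def standard_conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a depthwise k x k conv (one filter per input
    channel) followed by a 1 x 1 pointwise conv (bias omitted)."""
    return c_in * k * k + c_in * c_out


def receptive_field(kernel_sizes, dilations):
    """Receptive field of a stack of stride-1 convolutions.
    Each layer adds (k - 1) * dilation pixels to the field."""
    rf = 1
    for k, d in zip(kernel_sizes, dilations):
        rf += (k - 1) * d
    return rf


# Hypothetical layer sizes, not taken from the paper.
std = standard_conv_params(64, 128, 3)
sep = depthwise_separable_params(64, 128, 3)

# Three stacked 3x3 layers: undilated vs. assumed dilation rates 1, 2, 4.
plain = receptive_field([3, 3, 3], [1, 1, 1])
dilated = receptive_field([3, 3, 3], [1, 2, 4])

print(std, sep)        # weight counts: standard vs. separable
print(plain, dilated)  # receptive field in pixels: plain vs. dilated
```

Under these assumed sizes the separable layer needs roughly an eighth of the weights, and dilation roughly doubles the receptive field without any downsampling, which is consistent with the abstract's stated motivation of maintaining spatial resolution.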
Problem

Research questions and friction points this paper is trying to address.

structural defect segmentation
class imbalance
boundary delineation
multi-class segmentation
visual inspection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep Delta Attention
Tiered Attention
Skip Connection Refinement
Multi-scale Deep Supervision
Dilated Depthwise Separable Convolution
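
Multi-scale deep supervision is commonly implemented by adding weighted auxiliary losses from intermediate decoder stages to the main segmentation loss. The following is a minimal sketch of that general pattern, not the paper's exact formulation: the function name `deep_supervision_loss`, the number of auxiliary heads, and the weights are all hypothetical.

```python
def deep_supervision_loss(main_loss, aux_losses, aux_weights):
    """Combine the final-output loss with weighted auxiliary-head losses.

    main_loss   : loss at the full-resolution output
    aux_losses  : losses from intermediate decoder stages
    aux_weights : one weight per auxiliary loss (assumed values)
    """
    if len(aux_losses) != len(aux_weights):
        raise ValueError("need exactly one weight per auxiliary loss")
    weighted_aux = sum(w * l for w, l in zip(aux_weights, aux_losses))
    return main_loss + weighted_aux


# Hypothetical loss values for two auxiliary heads.
total = deep_supervision_loss(0.5, [0.75, 1.0], [0.5, 0.25])
print(total)  # 0.5 + 0.5*0.75 + 0.25*1.0
```

Because every auxiliary term contributes its own gradient, intermediate decoder stages receive a direct training signal instead of relying only on gradients propagated back from the final output.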