🤖 AI Summary
This study addresses the challenges of diverse damage patterns and environmental interference in structural damage identification by proposing a multi-scale spatial-channel joint attention network based on DenseNet201. The method incorporates an MS-SSE module that uses parallel depthwise convolutions to capture both local and contextual features, and combines Squeeze-and-Excitation channel attention with spatial attention to enhance salient regions and suppress noise. Evaluated on the StructDamage dataset, the model achieves a precision of 99.31%, recall of 99.25%, F1-score of 99.27%, and accuracy of 99.26%, outperforming the baseline DenseNet201 (98.53% accuracy) and the other compared approaches. These results indicate that the proposed architecture discriminates well between damage categories and remains robust in complex damage scenarios.
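The four reported metrics can be illustrated with a minimal sketch. It assumes macro averaging over classes, which the source does not state, and the class names and labels below are hypothetical toy data rather than StructDamage results.

```python
def macro_metrics(y_true, y_pred):
    """Macro-averaged precision, recall, F1, plus overall accuracy.

    Assumption: macro averaging (unweighted mean over classes); the
    source does not say which averaging scheme was used.
    """
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
    }


# Hypothetical toy labels, not taken from the dataset.
y_true = ["crack", "crack", "spalling", "corrosion", "corrosion", "corrosion"]
y_pred = ["crack", "spalling", "spalling", "corrosion", "corrosion", "crack"]
scores = macro_metrics(y_true, y_pred)
```

Under macro averaging the F1-score is the mean of per-class F1 values, so it need not equal the harmonic mean of the averaged precision and recall.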
📝 Abstract
Structural damage detection is essential for maintaining the safety and reliability of civil infrastructure. However, accurately identifying different types of structural damage from images remains challenging due to variations in damage patterns and environmental conditions. To address these challenges, this paper proposes MS-SSE-Net, a novel deep learning (DL) framework for structural damage classification. The model is built upon the DenseNet201 backbone and integrates multi-scale feature extraction with channel and spatial attention mechanisms. Parallel depthwise convolutions capture both local and contextual features, while squeeze-and-excitation style channel attention and spatial attention emphasize informative regions and suppress irrelevant noise. The refined features are then passed through global average pooling and a fully connected classification layer to generate the final predictions. Experiments are conducted on the StructDamage dataset, which contains multiple structural damage categories. MS-SSE-Net outperforms both the baseline DenseNet201 and the other compared approaches: it achieves 99.31% precision, 99.25% recall, 99.27% F1-score, and 99.26% accuracy, whereas the baseline achieves 98.62% precision, 98.53% recall, 98.58% F1-score, and 98.53% accuracy, a gain of 0.73 percentage points in accuracy.
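The pipeline described above (multi-scale depthwise convolutions, channel attention, spatial attention, global average pooling, fully connected layer) can be sketched as a forward pass in NumPy. This is an illustrative sketch, not the authors' implementation. The kernel sizes (3 and 5), the summation of branches, the reduction ratio of 16, the 7x7 spatial-attention kernel, the toy channel count, and the number of classes are all assumptions; the weights are random, and a real model would use a deep learning framework with trained parameters on top of DenseNet201 features.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def depthwise_conv(x, kernels):
    """Same-padded depthwise convolution: one k x k filter per channel.

    x: (C, H, W); kernels: (C, k, k) with odd k.
    """
    k = kernels.shape[-1]
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    windows = sliding_window_view(xp, (k, k), axis=(1, 2))  # (C, H, W, k, k)
    return np.einsum("chwij,cij->chw", windows, kernels)


def ms_sse_block(x, params):
    # Multi-scale branch: parallel depthwise convolutions, summed (assumed fusion).
    multi = sum(depthwise_conv(x, w) for w in params["dw_kernels"])

    # Squeeze-and-excitation style channel attention.
    squeezed = multi.mean(axis=(1, 2))                       # (C,)
    hidden = np.maximum(params["se_w1"] @ squeezed, 0.0)     # (C // r,)
    channel_gate = sigmoid(params["se_w2"] @ hidden)         # (C,)
    multi = multi * channel_gate[:, None, None]

    # Spatial attention over channel-wise mean and max maps (assumed design).
    pooled = np.stack([multi.mean(axis=0), multi.max(axis=0)])  # (2, H, W)
    spatial_logits = depthwise_conv(pooled, params["sa_kernels"]).sum(axis=0)
    return multi * sigmoid(spatial_logits)[None, :, :]


def classify(features, w_fc, b_fc):
    pooled = features.mean(axis=(1, 2))      # global average pooling -> (C,)
    logits = w_fc @ pooled + b_fc
    exp = np.exp(logits - logits.max())      # numerically stable softmax
    return exp / exp.sum()


def init_params(channels, num_classes, reduction=16, scales=(3, 5), seed=0):
    rng = np.random.default_rng(seed)
    return {
        "dw_kernels": [rng.normal(0, 0.1, (channels, k, k)) for k in scales],
        "se_w1": rng.normal(0, 0.1, (channels // reduction, channels)),
        "se_w2": rng.normal(0, 0.1, (channels, channels // reduction)),
        "sa_kernels": rng.normal(0, 0.1, (2, 7, 7)),
        "w_fc": rng.normal(0, 0.1, (num_classes, channels)),
        "b_fc": np.zeros(num_classes),
    }


# Toy sizes: 32 channels and 4 classes are hypothetical placeholders.
params = init_params(channels=32, num_classes=4)
feature_map = np.random.default_rng(1).normal(size=(32, 7, 7))  # stand-in for backbone output
refined = ms_sse_block(feature_map, params)
probs = classify(refined, params["w_fc"], params["b_fc"])
```

The block preserves the shape of the feature map, so it can be placed between the backbone and the pooling layer without changing the classifier's input size.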