🤖 AI Summary
To address performance limitations in multi-scale, multi-type road damage detection caused by data scarcity, this paper introduces DRDD, a new dataset that captures diverse damage types at varying scales within individual images, and proposes RDD4D, a detection model. Methodologically, RDD4D integrates a Transformer-enhanced CNN backbone with a multi-scale feature fusion architecture. Its core innovation is the Attention4D module, which incorporates four-dimensional attention (spatial, channel, scale, and semantic) alongside positional encoding and the Talking-Head mechanism for cross-scale feature refinement. Evaluated on DRDD, RDD4D achieves an AP of 0.458 on large cracks, the best among the compared state-of-the-art models, and a competitive overall AP of 0.445; on the CrackTinyNet dataset it improves performance by approximately 0.21. The code, model weights, DRDD dataset, and results are publicly available.
📝 Abstract
Road damage detection and assessment are crucial components of infrastructure maintenance. However, current methods often struggle to detect multiple types of road damage in a single image, particularly at varying scales. This is due to the lack of road datasets that cover multiple damage types at varying scales. To overcome this deficiency, we first present a novel dataset, the Diverse Road Damage Dataset (DRDD), which captures diverse road damage types in individual images, addressing a crucial gap in existing datasets. We then present our model, RDD4D, which exploits Attention4D blocks to enable better feature refinement across multiple scales. The Attention4D module processes feature maps through an attention mechanism that combines positional encoding and "Talking Head" components to capture local and global contextual information. In our comprehensive experimental analysis comparing various state-of-the-art models on our proposed dataset, our enhanced model demonstrated superior performance in detecting large-sized road cracks, with an Average Precision (AP) of 0.458, and maintained competitive performance with an overall AP of 0.445. Moreover, we also provide results on the CrackTinyNet dataset, where our model achieved around a 0.21 increase in performance. The code, model weights, dataset, and our results are available at https://github.com/msaqib17/Road_Damage_Detection.
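To make the attention mechanism named in the abstract more concrete, the following is a minimal pure-Python sketch of talking-heads attention with an additive positional bias. It is not the authors' Attention4D implementation: the function names, the tensor layout (`[heads][tokens][dim]`), the form of the positional term (a per-pair bias added to the logits), and the projection matrices `pre_proj` and `post_proj` are all assumptions made for illustration. The only idea taken from the general technique is that attention maps are linearly mixed across heads both before and after the softmax.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of attention logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def mix_heads(maps, proj):
    """Mix per-head (n x n) maps: out[g] = sum_h proj[h][g] * maps[h]."""
    n_in, n_out, n = len(maps), len(proj[0]), len(maps[0])
    return [
        [[sum(proj[h][g] * maps[h][i][j] for h in range(n_in))
          for j in range(n)]
         for i in range(n)]
        for g in range(n_out)
    ]


def talking_heads_attention(q, k, v, pos_bias, pre_proj, post_proj):
    """Self-attention with head mixing before and after the softmax.

    q, k, v:   [heads][tokens][dim] nested lists (hypothetical layout).
    pos_bias:  [tokens][tokens] additive positional term (an assumption).
    pre_proj, post_proj: [heads][heads] head-mixing matrices.
    """
    heads, n, d = len(q), len(q[0]), len(q[0][0])
    scale = 1.0 / math.sqrt(d)
    # Scaled dot-product logits plus the positional bias, one map per head.
    logits = [
        [[scale * sum(q[h][i][c] * k[h][j][c] for c in range(d)) + pos_bias[i][j]
          for j in range(n)]
         for i in range(n)]
        for h in range(heads)
    ]
    logits = mix_heads(logits, pre_proj)        # heads "talk" before softmax
    weights = [[softmax(row) for row in head] for head in logits]
    weights = mix_heads(weights, post_proj)     # and again after softmax
    d_v = len(v[0][0])
    # Weighted sum of values for each head and each query token.
    return [
        [[sum(weights[h][i][j] * v[h][j][c] for j in range(n))
          for c in range(d_v)]
         for i in range(n)]
        for h in range(len(v))
    ]
```

With identity matrices for `pre_proj` and `post_proj` and a zero `pos_bias`, the function reduces to ordinary multi-head self-attention, which is a convenient sanity check before trying non-trivial head mixing.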