YOLO-FDA: Integrating Hierarchical Attention and Detail Enhancement for Surface Defect Detection

📅 2025-06-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
Industrial surface defect detection faces several challenges: diverse defect morphologies, large scale variations, strong texture interference, and difficulty in fine-grained recognition. To address these, the authors propose YOLO-FDA, an enhanced YOLO-based multi-scale defect detection framework. The method introduces a Detail-directional Fusion Module (DDFM) with directional asymmetric convolution to improve sensitivity to minute defects; adds Attention-weighted Concatenation (AC) and Cross-layer Attention Fusion (CAF) to strengthen contextual modeling; and combines a BiFPN-style architecture with hierarchical attention to aggregate low-level details and high-level semantics. Experiments on multiple benchmark datasets show that the approach outperforms state-of-the-art methods in mAP, small-object recall, and cross-scale robustness, balancing detection accuracy with generalization capability.

📝 Abstract
Surface defect detection in industrial scenarios is both crucial and technically demanding because of the wide variability in defect types, irregular shapes and sizes, fine-grained requirements, and complex material textures. Although recent advances in AI-based detectors have improved performance, existing methods often suffer from redundant features, limited detail sensitivity, and weak robustness under multiscale conditions. To address these challenges, we propose YOLO-FDA, a novel YOLO-based detection framework that integrates fine-grained detail enhancement and attention-guided feature fusion. Specifically, we adopt a BiFPN-style architecture to strengthen bidirectional multilevel feature aggregation within the YOLOv5 backbone. To better capture fine structural changes, we introduce a Detail-directional Fusion Module (DDFM), which applies a directional asymmetric convolution in the second-lowest layer to enrich spatial details and fuses that layer with low-level features to enhance semantic consistency. Furthermore, we propose two novel attention-based fusion strategies, Attention-weighted Concatenation (AC) and Cross-layer Attention Fusion (CAF), to improve contextual representation and reduce feature noise. Extensive experiments on benchmark datasets demonstrate that YOLO-FDA consistently outperforms existing state-of-the-art methods in both accuracy and robustness across diverse defect types and scales.
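To make the idea of a directional asymmetric convolution concrete, the sketch below applies a 1×3 kernel and a 3×1 kernel to a tiny image and sums the two responses. The abstract does not specify DDFM's kernel sizes, weights, or how its branches are combined, so the function names (`conv2d_same`, `asymmetric_conv`), the kernel values, and the summation are illustrative assumptions rather than the authors' implementation.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_same(image: Matrix, kernel: Matrix) -> Matrix:
    """Zero-padded 'same' 2D cross-correlation for odd-sized kernels."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    yy, xx = y + i - ph, x + j - pw
                    # Positions outside the image count as zero padding.
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += image[yy][xx] * kernel[i][j]
            out[y][x] = acc
    return out


def asymmetric_conv(image: Matrix, horizontal: List[float],
                    vertical: List[float]) -> Matrix:
    """Sum of a 1xk branch and a kx1 branch (assumed combination rule)."""
    h_out = conv2d_same(image, [horizontal])             # 1 x k kernel
    v_out = conv2d_same(image, [[v] for v in vertical])  # k x 1 kernel
    return [[a + b for a, b in zip(row_h, row_v)]
            for row_h, row_v in zip(h_out, v_out)]


# A one-pixel-wide vertical line, standing in for a thin scratch.
image = [
    [0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0],
]
# Hand-picked second-difference kernels (hypothetical, not learned weights).
response = asymmetric_conv(image, [-1.0, 2.0, -1.0], [-1.0, 2.0, -1.0])
```

With these hand-picked kernels the response is largest along the one-pixel-wide vertical line and negative on either side of it. In a trained network the kernel weights would be learned rather than fixed.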
Problem

Research questions and friction points this paper is trying to address.

Detecting surface defects that vary widely in type, shape, and size
Improving sensitivity to fine detail in defect detection
Improving robustness under multiscale conditions
Innovation

Methods, ideas, or system contributions that make the work stand out.

BiFPN-style architecture for feature aggregation
Detail-directional Fusion Module for spatial details
Attention-based fusion strategies for contextual representation
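A BiFPN-style design merges feature maps from different levels using learned, non-negative weights. The sketch below shows the fast normalized fusion rule from the original BiFPN design (introduced with EfficientDet), applied to two flattened feature vectors. Whether YOLO-FDA uses this exact rule is an assumption, and `fast_normalized_fusion` and the sample values are hypothetical.

```python
from typing import List, Sequence


def fast_normalized_fusion(features: Sequence[Sequence[float]],
                           weights: Sequence[float],
                           eps: float = 1e-4) -> List[float]:
    """Fuse same-sized feature vectors as sum_i (w_i / (eps + sum_j w_j)) * f_i.

    Weights are clipped at zero (ReLU) so each input contributes a
    non-negative share and the shares sum to just under 1.
    """
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature vector")
    positive = [max(0.0, w) for w in weights]
    total = sum(positive) + eps
    fused = [0.0] * len(features[0])
    for w, feat in zip(positive, features):
        for k, value in enumerate(feat):
            fused[k] += (w / total) * value
    return fused


# Two feature vectors, e.g. a shallow map and an upsampled deeper map
# that have already been resized to the same shape and flattened.
shallow = [1.0, 2.0]
deep = [3.0, 4.0]
fused = fast_normalized_fusion([shallow, deep], [1.0, 1.0])
```

With equal weights the result is close to the element-wise mean of the inputs. A negative weight is clipped to zero, so that input drops out of the fusion entirely.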