Crack-EdgeSAM Self-Prompting Crack Segmentation System for Edge Devices

📅 2024-12-10

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

career value

218K/year

🤖 AI Summary

To address the challenges of low efficiency, poor accuracy, and difficult deployment for concrete crack detection and segmentation on edge devices in structural health monitoring (SHM), this paper proposes a lightweight, edge-native self-prompting segmentation system. Methodologically, we introduce the first joint architecture integrating YOLOv8—used to automatically generate bounding-box prompts—with EdgeSAM, and design ConvLoRA, a parameter-efficient fine-tuning strategy tailored for edge vision models. Training is further optimized using a Dice-Focal hybrid loss function. Our system achieves real-time inference at 8 FPS on Jetson Orin Nano with 1024×1024 input resolution, surpassing existing state-of-the-art methods in accuracy. Validated on a wall-climbing robot inspection platform, it demonstrates high segmentation accuracy and strong robustness under varying environmental conditions. Key contributions include: (1) the first self-prompting segmentation framework specifically designed for edge-based SHM; (2) the ConvLoRA fine-tuning paradigm for efficient adaptation of vision foundation models on resource-constrained devices; and (3) an end-to-end deployable, lightweight solution fully optimized for edge deployment.

Technology Category

Application Category

📝 Abstract

Structural health monitoring (SHM) is essential for the early detection of infrastructure defects, such as cracks in concrete bridge pier. but often faces challenges in efficiency and accuracy in complex environments. Although the Segment Anything Model (SAM) achieves excellent segmentation performance, its computational demands limit its suitability for real-time applications on edge devices. To address these challenges, this paper proposes Crack-EdgeSAM, a self-prompting crack segmentation system that integrates YOLOv8 for generating prompt boxes and a fine-tuned EdgeSAM model for crack segmentation. To ensure computational efficiency, the method employs ConvLoRA, a Parameter-Efficient Fine-Tuning (PEFT) technique, along with DiceFocalLoss to fine-tune the EdgeSAM model. Our experimental results on public datasets and the climbing robot automatic inspections demonstrate that the system achieves high segmentation accuracy and significantly enhanced inference speed compared to the most recent methods. Notably, the system processes 1024 x 1024 pixels images at 46 FPS on our PC and 8 FPS on Jetson Orin Nano.

Problem

Research questions and friction points this paper is trying to address.

Efficient crack segmentation on edge devices

Overcoming computational limitations of CNN and SAM models

Enhancing structural health monitoring precision and efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

YOLOv8 model for self-prompting crack detection

LoRA-based fine-tuned SAM for crack segmentation

Crack Mask Refinement Module for enhanced accuracy

🔎 Similar Papers

Staircase Cascaded Fusion of Lightweight Local Pattern Recognition and Long-Range Dependencies for Structural Crack Segmentation