Prompt-Driven Lightweight Foundation Model for Instance Segmentation-Based Fault Detection in Freight Trains

📅 2026-03-12
🤖 AI Summary
This work addresses the challenges of weak generalization and low boundary accuracy in visual fault detection for freight trains, which arise from repetitive components, occlusions, and contamination. To overcome these issues, the authors propose a lightweight self-prompting instance segmentation framework built upon the Segment Anything Model (SAM). The approach introduces an innovative self-prompt generation mechanism to enable efficient knowledge transfer from the foundation model to the specific domain, while employing a Tiny Vision Transformer backbone to facilitate real-time deployment on edge devices. Evaluated on a newly curated real-world freight train dataset, the model achieves 74.6 $AP^{\text{box}}$ and 74.2 $AP^{\text{mask}}$, outperforming current state-of-the-art methods and striking an effective balance among accuracy, robustness, and computational efficiency.

📝 Abstract
Accurate visual fault detection in freight trains remains a critical challenge for intelligent transportation system maintenance, due to complex operational environments, structurally repetitive components, and frequent occlusions or contaminations in safety-critical regions. Conventional instance segmentation methods based on convolutional neural networks and Transformers often suffer from poor generalization and limited boundary accuracy under such conditions. To address these challenges, we propose a lightweight self-prompted instance segmentation framework tailored for freight train fault detection. Our method leverages the Segment Anything Model by introducing a self-prompt generation module that automatically produces task-specific prompts, enabling effective knowledge transfer from foundation models to domain-specific inspection tasks. In addition, we adopt a Tiny Vision Transformer backbone to reduce computational cost, making the framework suitable for real-time deployment on edge devices in railway monitoring systems. We construct a domain-specific dataset collected from real-world freight inspection stations and conduct extensive evaluations. Experimental results show that our method achieves 74.6 $AP^{\text{box}}$ and 74.2 $AP^{\text{mask}}$ on the dataset, outperforming existing state-of-the-art methods in both accuracy and robustness while maintaining low computational overhead. This work offers a deployable and efficient vision solution for automated freight train inspection, demonstrating the potential of foundation model adaptation in industrial-scale fault diagnosis scenarios. Project page: https://github.com/MVME-HBUT/SAM_FTI-FDet.git
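
The two reported scores differ only in how a prediction is matched to a ground-truth instance: $AP^{\text{box}}$ uses the overlap of bounding boxes, while $AP^{\text{mask}}$ uses the overlap of pixel masks. The sketch below shows both overlap measures (intersection over union). It is a minimal illustration, not the paper's evaluation code; the function names `box_iou` and `mask_iou` are hypothetical, boxes are assumed to be `(x1, y1, x2, y2)` tuples, and masks are assumed to be equally sized nested lists of 0/1 values.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mask_iou(m1, m2):
    """IoU of two binary masks given as equally sized lists of 0/1 rows."""
    inter = 0
    union = 0
    for row1, row2 in zip(m1, m2):
        for p, q in zip(row1, row2):
            if p and q:
                inter += 1
            if p or q:
                union += 1
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(mask_iou([[1, 1], [0, 0]], [[1, 0], [1, 0]]))
```

A prediction counts as correct when its IoU with a ground-truth instance exceeds a chosen threshold; mask IoU is the stricter test of boundary quality, which is the weakness the abstract attributes to conventional methods.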
Problem

Research questions and friction points this paper is trying to address.

instance segmentation
fault detection
freight trains
occlusions
visual inspection
Innovation

Methods, ideas, or system contributions that make the work stand out.

self-prompting
lightweight foundation model
instance segmentation
fault detection
Tiny Vision Transformer
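
SAM segments whatever a prompt (a point, a box, or a mask) indicates, so using it without a human in the loop requires that the prompts come from the model itself. The sketch below illustrates that idea only: it turns a coarse per-instance score map into a box prompt and a point prompt. It is a hand-written stand-in, not the authors' learned self-prompt generation module; `prompts_from_score_map`, the `threshold` parameter, and the returned dictionary layout are all hypothetical.

```python
def prompts_from_score_map(score_map, threshold=0.5):
    """Derive a box prompt and a point prompt from a coarse score map.

    score_map is a list of rows of floats in [0, 1] (hypothetical output
    of a lightweight head). Returns None when no cell passes the threshold.
    """
    # Collect (x, y) coordinates of cells that are confidently foreground.
    hits = [
        (x, y)
        for y, row in enumerate(score_map)
        for x, score in enumerate(row)
        if score >= threshold
    ]
    if not hits:
        return None

    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    # Box prompt: tightest box around the foreground cells, as
    # (x1, y1, x2, y2) with exclusive right and bottom edges.
    box = (min(xs), min(ys), max(xs) + 1, max(ys) + 1)
    # Point prompt: the single most confident cell.
    point = max(hits, key=lambda p: score_map[p[1]][p[0]])
    return {"box": box, "point": point}


if __name__ == "__main__":
    demo = [
        [0.1, 0.2, 0.1],
        [0.1, 0.9, 0.6],
        [0.0, 0.7, 0.2],
    ]
    print(prompts_from_score_map(demo))
```

In a real pipeline the resulting prompts would be scaled to image coordinates and passed to the segmentation model's prompt encoder; that step is omitted here because it depends on details the summary does not give.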
Authors

Guodong Sun
INRIA
wireless communication, resource management, probability theory
Qihang Liang
School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China; Hubei Key Laboratory of Modern Manufacturing Quality Engineering, Hubei University of Technology, Wuhan 430068, China
Xingyu Pan
School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China; Hubei Key Laboratory of Modern Manufacturing Quality Engineering, Hubei University of Technology, Wuhan 430068, China
Moyun Liu
Huazhong University of Science and Technology
Embodied AI, Computer Vision
Yang Zhang
School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China; Hubei Key Laboratory of Modern Manufacturing Quality Engineering, Hubei University of Technology, Wuhan 430068, China