🤖 AI Summary
Existing training-free foundation models for tunnel defect detection offer only coarse-grained semantic suggestions. They struggle with strong real-world interference and fail to meet engineering requirements for precise localization, measurement, severity grading, and documentation. This work proposes TunnelMIND, a framework that, during inference, spatially recalibrates language-guided, open-vocabulary defect proposals through dense visual consistency and reformulates them into structured defect entities encompassing category, location, geometry, severity, and contextual attributes. By integrating expert knowledge constraints, the framework generates interpretable engineering reports. According to the authors, this is the first approach to achieve end-to-end generation of structured engineering evidence from tunnel defect detection without any training. It attains F1 scores of 0.68, 0.78, and 0.72 on visible-light, ground-penetrating radar (GPR), and road defect tasks, respectively, demonstrating its effectiveness across modalities.
📝 Abstract
Tunnel inspection requires outputs that can support defect localization, measurement, severity grading, and engineering documentation. Existing training-free foundation-model pipelines usually stop at coarse open-vocabulary proposals, which are difficult to use directly in interference-heavy tunnel scenes. We propose TunnelMIND, a training-free framework. Specifically, language-guided defect proposals are not treated as final outputs; instead, their spatial support is recalibrated at inference time through dense visual consistency, so that coarse semantic anchors can be transformed into more reliable prompts under tunnel-specific hard negatives. The resulting masks are further reconstructed into structured defect entities with category, location, geometry, severity, and context attributes, which are then mapped to retrieval-grounded explanation and engineering-readable report generation under expert knowledge constraints. On visible-light, ground-penetrating radar (GPR), and road defect tasks, TunnelMIND achieves F1 scores of 0.68, 0.78, and 0.72, respectively. Overall, TunnelMIND shows that training-free tunnel inspection can move beyond coarse localization toward structured defect evidence for engineering assessment.
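To make the idea of a "structured defect entity" concrete, the following is a minimal sketch of how a binary mask might be reconstructed into a record with category, location, geometry, severity, and context fields. It is not the authors' implementation: the `DefectEntity` class, the `mask_to_entity` function, the field names, and the area-ratio severity thresholds are all hypothetical choices made for illustration, and the abstract does not specify how severity is actually graded.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DefectEntity:
    """Hypothetical record; field names are illustrative, not from the paper."""
    category: str                     # e.g. "crack", taken from the proposal label
    bbox: Tuple[int, int, int, int]   # (row_min, col_min, row_max, col_max), inclusive
    centroid: Tuple[float, float]     # mean (row, col) of foreground pixels
    area_px: int                      # number of foreground pixels
    severity: str                     # coarse grade from an assumed rule
    context: str                      # free-text context, e.g. "near lining joint"


def mask_to_entity(mask: List[List[int]], category: str,
                   context: str = "unspecified") -> DefectEntity:
    """Summarize a binary mask (rows of 0/1 values) as a structured entity."""
    pixels = [(r, c) for r, row in enumerate(mask)
              for c, v in enumerate(row) if v]
    if not pixels:
        raise ValueError("mask contains no foreground pixels")

    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    area = len(pixels)

    # Assumed severity rule: grade by the fraction of the image covered.
    # The thresholds are placeholders; a real system would use expert criteria.
    ratio = area / (len(mask) * len(mask[0]))
    if ratio < 0.01:
        severity = "minor"
    elif ratio < 0.05:
        severity = "moderate"
    else:
        severity = "severe"

    return DefectEntity(
        category=category,
        bbox=(min(rows), min(cols), max(rows), max(cols)),
        centroid=(sum(rows) / area, sum(cols) / area),
        area_px=area,
        severity=severity,
        context=context,
    )


# Usage: a 4x5 mask with a 2x2 foreground block.
example_mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
entity = mask_to_entity(example_mask, category="crack", context="sidewall")
```

A flat record of this kind is easy to serialize and to pass to a downstream report generator. Physical measurements (for example, crack width in millimetres) would additionally need a pixel-to-length calibration, which this sketch omits.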