Self-Supervised Visual Prompting for Cross-Domain Road Damage Detection

📅 2025-11-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
Weak cross-domain generalization severely hinders the practical deployment of automated pavement defect detection. To address this, we propose the first self-supervised visual prompting framework tailored for road damage detection: it generates defect-aware visual prompts from unlabeled target-domain images to guide representation adaptation of a frozen Vision Transformer (ViT) backbone. We introduce two key innovations, the Self-supervised Prompt Enhancement Module (SPEM) and the Domain-Aware Prompt Alignment (DAPA) strategy, which enable focused defect feature learning and cross-domain representation alignment. Our method requires no annotations, learning prompts solely from target-domain imagery. Evaluated on four benchmark datasets, it achieves superior zero-shot transfer performance over state-of-the-art supervised, self-supervised, and domain-adaptation methods, while significantly improving few-shot adaptation efficiency and cross-domain robustness.

📝 Abstract
The deployment of automated pavement defect detection is often hindered by poor cross-domain generalization. Supervised detectors achieve strong in-domain accuracy but require costly re-annotation for new environments, while standard self-supervised methods capture generic features and remain vulnerable to domain shift. We propose PROBE, a self-supervised framework that visually probes target domains without labels. PROBE introduces a Self-supervised Prompt Enhancement Module (SPEM), which derives defect-aware prompts from unlabeled target data to guide a frozen ViT backbone, and a Domain-Aware Prompt Alignment (DAPA) objective, which aligns prompt-conditioned source and target representations. Experiments on four challenging benchmarks show that PROBE consistently outperforms strong supervised, self-supervised, and adaptation baselines, achieving robust zero-shot transfer, improved resilience to domain variations, and high data efficiency in few-shot adaptation. These results highlight self-supervised prompting as a practical direction for building scalable and adaptive visual inspection systems. Source code is publicly available: https://github.com/xixiaouab/PROBE/tree/main
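The abstract describes prompts that condition a frozen ViT backbone. The paper does not give implementation details here, but the general visual-prompting recipe it builds on can be sketched as prepending a small set of learnable prompt tokens to the frozen patch embeddings. The dimensions below (196 patches, 8 prompts, 768-d) are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 196 ViT patch tokens, 8 prompt tokens, 768-d embeddings.
num_patches, num_prompts, dim = 196, 8, 768

def prepend_prompts(patch_tokens, prompt_tokens):
    """Build the joint token sequence fed to the frozen transformer blocks.

    The backbone weights stay fixed; only the prompt tokens would be
    updated by the self-supervised objective.
    """
    return np.concatenate([prompt_tokens, patch_tokens], axis=0)

patch_tokens = rng.standard_normal((num_patches, dim))   # frozen backbone features
prompt_tokens = rng.standard_normal((num_prompts, dim))  # the only learnable parameters

sequence = prepend_prompts(patch_tokens, prompt_tokens)
print(sequence.shape)
```

Because only the prompt tokens are trained, the adaptable parameter count is tiny compared with fine-tuning the backbone, which is what makes the few-shot efficiency claims plausible.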
Problem

Research questions and friction points this paper is trying to address.

Addressing poor cross-domain generalization in automated pavement defect detection
Eliminating costly re-annotation requirements for new environmental domains
Improving resilience to domain shift in visual inspection systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-supervised prompting for cross-domain defect detection
Defect-aware prompts derived from unlabeled target data
Domain-aligned prompt conditioning for robust transfer
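The "domain-aligned prompt conditioning" idea above implies a loss that pulls prompt-conditioned source and target representations together. The paper's exact DAPA objective is not specified in this summary; as a rough stand-in, a simple moment-matching distance between batch-mean features of the two domains illustrates the general shape such an alignment term takes (all names and values here are hypothetical):

```python
import numpy as np

def alignment_loss(source_feats, target_feats):
    """Squared distance between the batch-mean features of two domains.

    A minimal moment-matching sketch: the real DAPA objective may align
    distributions differently, but any such term should score matched
    domains lower than shifted ones.
    """
    gap = source_feats.mean(axis=0) - target_feats.mean(axis=0)
    return float(np.sum(gap ** 2))

rng = np.random.default_rng(0)
src = rng.standard_normal((32, 768))        # prompt-conditioned source features
tgt = rng.standard_normal((32, 768)) + 0.5  # target features under domain shift

print(alignment_loss(src, tgt) > alignment_loss(src, src))  # prints True
```

Minimizing such a term with respect to the prompt tokens (the backbone being frozen) is one plausible way the target-domain prompts could be steered toward source-compatible representations.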
Xi Xiao
Oak Ridge National Laboratory | University of Alabama at Birmingham
LLM / MLLM Efficiency, Image / Video Generation, Image / Video Understanding

Zhuxuanzi Wang
Cornell University, Ithaca, NY, USA

Mingqiao Mo
Cornell University, Ithaca, NY, USA

Chen Liu
Yale University, New Haven, CT, USA

Chenrui Ma
University of California Irvine, Irvine, CA, USA

Yanshu Li
Brown University
NLP, Multimodal Learning

Smita Krishnaswamy
Yale University
Machine Learning, Data Mining, Manifold Learning, Deep Learning, Computational Biology

Xiao Wang
Oak Ridge National Laboratory, Oak Ridge, TN, USA

Tianyang Wang
University of Alabama at Birmingham
Machine Learning (Deep Learning), Computer Vision