NexViTAD: Few-shot Unsupervised Cross-Domain Defect Detection via Vision Foundation Models and Multi-Task Learning

📅 2025-07-10
🤖 AI Summary
To address domain shift in few-shot unsupervised cross-domain anomaly detection for industrial inspection, this paper proposes NexViTAD, a framework that integrates two vision foundation models (Hiera and DINO-v2). It introduces a hierarchical adapter for parameter-efficient domain adaptation, a shared subspace projection that aligns feature distributions across domains, and a multi-task decoder for joint reconstruction and discrimination. Anomaly scoring is driven by Sinkhorn-K-means clustering, which helps separate normal from anomalous patterns. Feature robustness is improved through bottleneck dimension constraints, skip connections, and Gaussian filtering; pixel-level localization is achieved through adaptive thresholding. Evaluated on MVTec AD, NexViTAD reports 97.5% AUC, 70.4% AP, and 95.2% PRO in the target domains, surpassing other recent models.
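
The summary names Gaussian filtering and adaptive thresholding for pixel-level localization but does not state the thresholding rule. The following is a minimal NumPy sketch under stated assumptions: the threshold is taken as the mean plus `k` standard deviations of the smoothed map, which is a common heuristic and not necessarily the rule used in the paper. The function names and the `sigma` and `k` parameters are hypothetical.

```python
import numpy as np


def gaussian_kernel1d(sigma: float) -> np.ndarray:
    """Return a normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(3 * sigma + 0.5))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()


def smooth_map(score_map: np.ndarray, sigma: float = 2.0) -> np.ndarray:
    """Separable Gaussian blur of a 2-D map, replicating edge values at the borders."""
    kernel = gaussian_kernel1d(sigma)
    radius = len(kernel) // 2
    padded = np.pad(score_map.astype(np.float64), radius, mode="edge")
    # Filter along rows, then along columns; "valid" trims the padding again.
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")


def adaptive_mask(score_map: np.ndarray, sigma: float = 2.0, k: float = 3.0):
    """Smooth the anomaly map and threshold it at mean + k * std (assumed rule)."""
    smoothed = smooth_map(score_map, sigma)
    threshold = smoothed.mean() + k * smoothed.std()
    return smoothed > threshold, threshold
```

Calling `adaptive_mask(score_map)` on a per-pixel anomaly map returns a boolean defect mask and the threshold that produced it. Because the threshold is derived from each map's own statistics, it adapts to the score scale of a new target domain without a hand-tuned constant.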

📝 Abstract
This paper presents Nexus Vision Transformer for Anomaly Detection (NexViTAD), a few-shot cross-domain anomaly detection framework built on vision foundation models. It addresses domain-shift challenges in industrial anomaly detection through a shared subspace projection mechanism and a multi-task learning (MTL) module. The main innovations are: (1) a hierarchical adapter module that adaptively fuses complementary features from the pre-trained Hiera and DINO-v2 models to construct more robust feature representations; (2) a shared subspace projection strategy that enables cross-domain knowledge transfer through bottleneck dimension constraints and skip connections; (3) an MTL decoder architecture that processes multiple source domains simultaneously, improving model generalization; (4) an anomaly score inference method based on Sinkhorn-K-means clustering, combined with Gaussian filtering and adaptive thresholding for precise pixel-level localization. Evaluated on the MVTec AD dataset, NexViTAD delivers state-of-the-art performance in the target domains, with an AUC of 97.5%, an AP of 70.4%, and a PRO of 95.2%, surpassing other recent models.
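
The abstract does not spell out how Sinkhorn-K-means is formulated. One common reading, sketched below as an assumption, is K-means in which the assignment step is replaced by an entropy-regularized optimal-transport (Sinkhorn) assignment with uniform cluster marginals, so every cluster receives an equal share of the data. Centroids are fitted on features of normal samples, and the anomaly score is taken as the distance to the nearest centroid, which is also an assumption. All function names and parameters (`eps`, `n_rounds`, and so on) are hypothetical.

```python
import numpy as np


def squared_distances(x: np.ndarray, c: np.ndarray) -> np.ndarray:
    """Pairwise squared Euclidean distances, shape (n_points, n_clusters)."""
    return ((x[:, None, :] - c[None, :, :]) ** 2).sum(axis=-1)


def sinkhorn_assign(cost: np.ndarray, eps: float = 0.05, n_iters: int = 50) -> np.ndarray:
    """Balanced soft assignments from Sinkhorn iterations; each row sums to 1."""
    n, k = cost.shape
    # Rescale the cost to [0, 1] so exp() cannot underflow to zero.
    kernel = np.exp(-cost / (cost.max() + 1e-12) / eps)
    a = np.full(n, 1.0 / n)  # every point carries equal mass
    b = np.full(k, 1.0 / k)  # every cluster receives equal mass
    v = np.ones(k)
    for _ in range(n_iters):
        u = a / (kernel @ v)
        v = b / (kernel.T @ u)
    plan = u[:, None] * kernel * v[None, :]
    return plan / plan.sum(axis=1, keepdims=True)


def sinkhorn_kmeans(features, n_clusters=4, n_rounds=10, eps=0.05, seed=0):
    """Alternate Sinkhorn assignments and weighted centroid updates."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(features), n_clusters, replace=False)
    centroids = features[start].astype(np.float64)
    for _ in range(n_rounds):
        weights = sinkhorn_assign(squared_distances(features, centroids), eps)
        centroids = (weights.T @ features) / weights.sum(axis=0)[:, None]
    return centroids


def anomaly_scores(features: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """Score each feature by its distance to the nearest centroid (assumed rule)."""
    return np.sqrt(squared_distances(features, centroids).min(axis=1))
```

In this sketch, `sinkhorn_kmeans` would be run once on features from defect-free images, and `anomaly_scores` would then be applied to the features of a test image. The balanced marginals keep any single centroid from absorbing most of the normal features, which is the usual motivation for swapping hard K-means assignments for Sinkhorn ones.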
Problem

Research questions and friction points this paper is trying to address.

Addresses domain-shift in industrial anomaly detection
Enables cross-domain knowledge transfer via shared subspace
Improves defect detection with multi-task learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical adapter fuses Hiera and DINO-v2 features
Shared subspace projection enables cross-domain transfer
MTL Decoder processes multiple source domains simultaneously
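
The first two points above can be illustrated with a forward-pass sketch. It assumes the adapter fuses the two backbones' token features through a learned softmax gate, and that the shared subspace projection is a down-projection to a bottleneck dimension followed by an up-projection with a skip connection. Both are assumptions made for illustration, not the paper's confirmed design. The class name, the feature dimensions, and the random weights standing in for trained layers are all hypothetical.

```python
import numpy as np


class FusionWithBottleneck:
    """Gated fusion of two feature sets, then a bottleneck projection with a skip."""

    def __init__(self, dim_a, dim_b, dim_shared, dim_bottleneck, seed=0):
        rng = np.random.default_rng(seed)

        def linear(in_dim, out_dim):
            # Random weights stand in for trained parameters (hypothetical).
            return rng.normal(0.0, in_dim ** -0.5, size=(in_dim, out_dim))

        self.proj_a = linear(dim_a, dim_shared)       # backbone A -> shared width
        self.proj_b = linear(dim_b, dim_shared)       # backbone B -> shared width
        self.gate = linear(2 * dim_shared, 2)         # per-token fusion weights
        self.down = linear(dim_shared, dim_bottleneck)
        self.up = linear(dim_bottleneck, dim_shared)

    def fuse(self, feats_a, feats_b):
        """Mix the two projected feature sets with per-token softmax weights."""
        a = feats_a @ self.proj_a
        b = feats_b @ self.proj_b
        logits = np.concatenate([a, b], axis=-1) @ self.gate
        logits = logits - logits.max(axis=-1, keepdims=True)
        w = np.exp(logits)
        w = w / w.sum(axis=-1, keepdims=True)
        return w[..., :1] * a + w[..., 1:] * b

    def project(self, fused):
        """Bottleneck projection (ReLU in between) added back onto its input."""
        hidden = np.maximum(fused @ self.down, 0.0)
        return fused + hidden @ self.up

    def __call__(self, feats_a, feats_b):
        return self.project(self.fuse(feats_a, feats_b))
```

With, say, 16 tokens of width 96 from one backbone and width 128 from the other, `FusionWithBottleneck(96, 128, 64, 8)` returns a `(16, 64)` array. The narrow bottleneck limits how much the projection can change the fused features, while the skip connection lets the original features pass through unchanged.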
Tianwei Mu
Guangzhou Institute of Industrial Intelligence, Guangzhou
Feiyu Duan
Beihang University
natural language processing
Bo Zhou
School of Architecture & Civil Engineering, Shenyang University of Technology, Shenyang
Dan Xue
School of Information Science and Engineering, Shenyang University of Technology, Shenyang
Manhong Huang
College of Environmental Science and Engineering, Donghua University, Shanghai