Exploring Multimodal Prompts For Unsupervised Continuous Anomaly Detection

📅 2026-03-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a limitation of existing unsupervised continuous anomaly detection (UCAD) methods: because they rely on the visual modality alone, they struggle to model the manifold of normal patterns in complex scenes, which caps detection performance. To overcome this, the paper proposes a multimodal prompt-based UCAD framework. The approach introduces a Continual Multimodal Prompt Memory Bank (CMPMB) that retains prototypical normal patterns from both visual and textual domains across consecutive tasks, together with a Defect-Semantic-Guided Adaptive Fusion Mechanism (DSG-AFM) that combines an Adaptive Normalization Module (ANM) with a Dynamic Fusion Strategy (DFS). On the MVTec AD and VisA benchmarks, the method is reported to achieve state-of-the-art image-level AUROC and pixel-level AUPR while also improving adversarial robustness.

📝 Abstract
Unsupervised Continuous Anomaly Detection (UCAD) is gaining attention for effectively addressing the catastrophic forgetting and heavy computational burden issues in traditional Unsupervised Anomaly Detection (UAD). However, existing UCAD approaches that rely solely on visual information are insufficient to capture the manifold of normality in complex scenes, thereby impeding further gains in anomaly detection accuracy. To overcome this limitation, we propose an unsupervised continual anomaly detection framework grounded in multimodal prompting. Specifically, we introduce a Continual Multimodal Prompt Memory Bank (CMPMB) that progressively distills and retains prototypical normal patterns from both visual and textual domains across consecutive tasks, yielding a richer representation of normality. Furthermore, we devise a Defect-Semantic-Guided Adaptive Fusion Mechanism (DSG-AFM) that integrates an Adaptive Normalization Module (ANM) with a Dynamic Fusion Strategy (DFS) to jointly enhance detection accuracy and adversarial robustness. Benchmark experiments on the MVTec AD and VisA datasets show that our approach achieves state-of-the-art (SOTA) performance on image-level AUROC and pixel-level AUPR metrics.
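The two metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code. The function names `auroc` and `average_precision` and all sample values are hypothetical, and average precision is used here as one common estimator of the area under the precision-recall curve (AUPR). Image-level scoring treats each image as one sample; pixel-level scoring flattens the ground-truth mask and the anomaly map so that each pixel is one sample.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney U) formulation."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Count anomalous/normal pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision, a step-wise estimate of the area under the PR curve.

    Assumption: tied scores are not grouped; they keep their input order.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive sample")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / n_pos


# Image level: one label (1 = anomalous) and one score per image (made-up values).
image_labels = [0, 0, 1, 1]
image_scores = [0.1, 0.4, 0.35, 0.8]
image_auroc = auroc(image_labels, image_scores)

# Pixel level: flatten a ground-truth mask and an anomaly map (made-up values).
pixel_mask = [[0, 0], [0, 1]]
anomaly_map = [[0.2, 0.1], [0.3, 0.9]]
flat_labels = [v for row in pixel_mask for v in row]
flat_scores = [v for row in anomaly_map for v in row]
pixel_aupr = average_precision(flat_labels, flat_scores)

print(f"image-level AUROC: {image_auroc:.3f}")
print(f"pixel-level AUPR (average precision): {pixel_aupr:.3f}")
```

Pixel-level AUPR is often preferred over pixel-level AUROC for this task because anomalous pixels are a small minority, and precision-recall is more sensitive to false positives under that imbalance.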
Problem

Research questions and friction points this paper is trying to address.

Unsupervised Continuous Anomaly Detection
Multimodal Prompts
Normality Representation
Catastrophic Forgetting
Anomaly Detection Accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal Prompting
Continual Anomaly Detection
Prompt Memory Bank
Adaptive Fusion Mechanism
Unsupervised Learning
Mingle Zhou
Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
Jiahui Liu
Fujitsu Research of America
Quantum Computing, Cryptography, Quantum Cryptography
Jin Wan
Associate Professor of Computer Science and Technology, Qilu University of Technology
Computer vision, Machine learning
Gang Li
Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan, China; Faculty of Data Science, City University of Macau, Macau, China
Min Li
Faculty of Data Science, City University of Macau, Macau, China; Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan, China