🤖 AI Summary
This work addresses a key limitation of existing unsupervised continual anomaly detection methods: relying on a single visual modality, they struggle to accurately model the manifold of normal patterns in complex scenes, which constrains detection performance. To overcome this, we propose the first multimodal prompt-based unsupervised continual anomaly detection framework. Our approach introduces a Continual Multimodal Prompt Memory Bank (CMPMB), complemented by a defect-semantics-guided Adaptive Normalization Module (ANM) and a Dynamic Fusion Strategy (DFS), enabling effective collaboration among multimodal cues. By moving beyond unimodal modeling, the proposed method achieves state-of-the-art image-level AUROC and pixel-level AUPR on the MVTec AD and VisA benchmarks, while also enhancing adversarial robustness.
📝 Abstract
Unsupervised Continual Anomaly Detection (UCAD) is gaining attention for effectively addressing the catastrophic forgetting and heavy computational burden of traditional Unsupervised Anomaly Detection (UAD). However, existing UCAD approaches rely solely on visual information and are therefore insufficient to capture the manifold of normality in complex scenes, impeding further gains in anomaly detection accuracy. To overcome this limitation, we propose an unsupervised continual anomaly detection framework grounded in multimodal prompting. Specifically, we introduce a Continual Multimodal Prompt Memory Bank (CMPMB) that progressively distills and retains prototypical normal patterns from both the visual and textual domains across consecutive tasks, yielding a richer representation of normality. Furthermore, we devise a Defect-Semantic-Guided Adaptive Fusion Mechanism (DSG-AFM) that integrates an Adaptive Normalization Module (ANM) with a Dynamic Fusion Strategy (DFS) to jointly enhance detection accuracy and adversarial robustness. Experiments on the MVTec AD and VisA benchmarks show that our approach achieves state-of-the-art (SOTA) performance on image-level AUROC and pixel-level AUPR.
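To make the memory-bank idea concrete, here is a minimal NumPy sketch of a task-keyed multimodal prompt memory bank in the spirit of the CMPMB. Everything here is an illustrative assumption, not the paper's implementation: the class and method names (`MultimodalPromptMemoryBank`, `add_task`, `score`), the use of random subsampling as a stand-in for prototype distillation, and the fixed 50/50 fusion of visual and textual distances (the paper's DSG-AFM weighs these adaptively).

```python
import numpy as np


class MultimodalPromptMemoryBank:
    """Toy sketch of a task-keyed multimodal prompt memory bank.

    Per task it stores: a key (mean visual feature) used to identify
    the task at test time, a small set of visual prototypes of normal
    features, and one text-prompt embedding. All names are hypothetical.
    """

    def __init__(self):
        self.task_keys = []      # one key vector per task
        self.visual_protos = []  # per task: (P, D) normal prototypes
        self.text_prompts = []   # per task: (D,) text-prompt embedding

    def add_task(self, normal_feats, text_embed, n_protos=8):
        """Distill a new task's normal features into a few prototypes.

        Random subsampling stands in for whatever coreset/distillation
        procedure a real system would use.
        """
        idx = np.random.default_rng(0).choice(
            len(normal_feats), size=n_protos, replace=False)
        self.task_keys.append(normal_feats.mean(axis=0))
        self.visual_protos.append(normal_feats[idx])
        self.text_prompts.append(text_embed)

    def score(self, feat):
        """Pick the closest task by key, then score the feature.

        Visual term: distance to the nearest stored normal prototype.
        Textual term: cosine dissimilarity to the task's text prompt.
        The equal-weight fusion below is a placeholder assumption.
        """
        keys = np.stack(self.task_keys)
        t = int(np.argmin(np.linalg.norm(keys - feat, axis=1)))
        d_vis = np.linalg.norm(self.visual_protos[t] - feat, axis=1).min()
        d_txt = 1.0 - float(feat @ self.text_prompts[t]) / (
            np.linalg.norm(feat) * np.linalg.norm(self.text_prompts[t]) + 1e-8)
        return 0.5 * d_vis + 0.5 * d_txt, t
```

Usage follows the continual setting described in the abstract: tasks arrive one at a time, each call to `add_task` stores only a compact set of prompts (so old tasks are not revisited), and `score` first routes a test feature to its task via the key before computing the anomaly score.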