MRAD: Zero-Shot Anomaly Detection with Memory-Driven Retrieval

📅 2026-01-31
🤖 AI Summary
This work addresses the high training or inference cost and limited cross-domain stability that arise in zero-shot anomaly detection when methods rely on prompt learning or complex modeling to fit the data distribution. The authors propose MRAD, a memory-driven retrieval framework whose training-free base model, MRAD-TF, builds a two-level (image-level and pixel-level) memory bank from auxiliary data and obtains anomaly scores directly by similarity retrieval with a frozen CLIP image encoder, avoiding parametric fitting. Two lightweight variants build on this base: MRAD-FT, which fine-tunes the retrieval metric with two linear layers, and MRAD-CLIP, which injects the normal and anomalous region priors from MRAD-FT as dynamic biases into CLIP's learnable text prompts. Across 16 industrial and medical datasets, the authors report superior performance in both anomaly classification and segmentation, in train-free and training-based settings alike, which supports the idea of leveraging the empirical data distribution rather than relying only on model fitting.

📝 Abstract
Zero-shot anomaly detection (ZSAD) often leverages pretrained vision or vision-language models, but many existing methods use prompt learning or complex modeling to fit the data distribution, resulting in high training or inference cost and limited cross-domain stability. To address these limitations, we propose the Memory-Retrieval Anomaly Detection method (MRAD), a unified framework that replaces parametric fitting with direct memory retrieval. The train-free base model, MRAD-TF, freezes the CLIP image encoder and constructs a two-level memory bank (image-level and pixel-level) from auxiliary data, in which feature-label pairs are explicitly stored as keys and values. During inference, anomaly scores are obtained directly by similarity retrieval over the memory bank. Building on MRAD-TF, we further propose two lightweight variants: (i) MRAD-FT fine-tunes the retrieval metric with two linear layers to enhance the discriminability between normal and anomalous data; (ii) MRAD-CLIP injects the normal and anomalous region priors from MRAD-FT as dynamic biases into CLIP's learnable text prompts, strengthening generalization to unseen categories. Across 16 industrial and medical datasets, the MRAD framework consistently demonstrates superior performance in anomaly classification and segmentation under both train-free and training-based settings. Our work shows that fully leveraging the empirical distribution of raw data, rather than relying only on model fitting, can achieve stronger anomaly detection performance. The code will be publicly released at https://github.com/CROVO1026/MRAD.
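As a rough illustration of what "similarity retrieval over the memory bank" could look like, the sketch below stores feature-label pairs as keys and values and scores a query by a similarity-weighted average of the stored labels. This is not the authors' implementation: the softmax weighting, the `temperature` value, the binary 0/1 labels, and the names `l2_normalize` and `retrieve_scores` are all assumptions made for the example, and small hand-written vectors stand in for frozen CLIP features.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row to (approximately) unit length so dot products act as cosine similarity."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def retrieve_scores(queries, keys, values, temperature=0.07):
    """Similarity-weighted label lookup over a key-value memory (hypothetical helper).

    queries: (Q, D) features to score
    keys:    (N, D) stored features from auxiliary data
    values:  (N,)   stored labels, assumed here to be 0 (normal) or 1 (anomalous)
    returns: (Q,)   anomaly scores in [0, 1]
    """
    q = l2_normalize(np.asarray(queries, dtype=float))
    k = l2_normalize(np.asarray(keys, dtype=float))
    logits = (q @ k.T) / temperature             # (Q, N) scaled cosine similarities
    logits -= logits.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ np.asarray(values, dtype=float)


# Toy memory: one "normal" key and one "anomalous" key (stand-ins for CLIP features).
keys = np.array([[1.0, 0.0], [0.0, 1.0]])
values = np.array([0.0, 1.0])

# Image-level use: one feature vector per image.
image_scores = retrieve_scores(np.array([[2.0, 0.0], [0.0, 3.0]]), keys, values)

# Pixel-level use: flatten an (H, W, D) grid of patch features, score, reshape to a map.
patches = np.array([[[1.0, 0.0], [0.0, 1.0]],
                    [[1.0, 0.0], [1.0, 1.0]]])
h, w, d = patches.shape
anomaly_map = retrieve_scores(patches.reshape(-1, d), keys, values).reshape(h, w)
```

The same lookup serves both levels of the memory bank in this sketch: image-level queries yield one classification score per image, while patch-level queries yield a coarse segmentation map that would typically be upsampled to the input resolution.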
Problem

Research questions and friction points this paper is trying to address.

zero-shot anomaly detection
cross-domain stability
training cost
inference cost
data distribution fitting
Innovation

Methods, ideas, or system contributions that make the work stand out.

zero-shot anomaly detection
memory retrieval
CLIP
train-free framework
anomaly segmentation
Chaoran Xu
Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
Chengkan Lv
Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
Qiyu Chen
Institute of Automation, Chinese Academy of Sciences
Anomaly Detection, Computer Vision, Deep Learning
Feng Zhang
Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China
Zhengtao Zhang
Institute of Automation, Chinese Academy of Sciences, Beijing, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China