RAID: Retrieval-Augmented Anomaly Detection

📅 2026-02-23
🤖 AI Summary
This work addresses a core challenge in unsupervised anomaly detection: matching test images to normal templates is prone to noise caused by intra-class variation, misalignment, and limited template representativeness. To mitigate this, the authors propose RAID, a framework that introduces retrieval-augmented mechanisms into this task for the first time. RAID uses a hierarchical vector database to retrieve normal samples at the category, semantic, and instance levels, forming a coarse-to-fine multi-stage retrieval pipeline. It then combines a matching cost volume with a guided Mixture-of-Experts network that dynamically suppresses matching noise during anomaly map generation. Experiments on MVTec, VisA, MPDD, and BTAD show state-of-the-art performance under full-data, few-shot, and cross-dataset settings, improving the robustness of both anomaly detection and localization.
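To make the coarse-to-fine retrieval idea concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the nested-dictionary layout of the database, the use of cosine similarity at every level, and all names (`coarse_to_fine_retrieve`, `toy_db`, `top_k`) are assumptions chosen for illustration only.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to unit length so dot products equal cosine similarity."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def coarse_to_fine_retrieve(query, db, top_k=2):
    """Pick a category, then a semantic cluster, then the top-k instances.

    Hypothetical layout: db[category] = {"centroid": (D,), "clusters": [
        {"centroid": (D,), "instances": (N, D)}, ...]}
    """
    q = l2_normalize(query)

    # Category level: nearest category centroid.
    categories = list(db)
    cat_scores = [float(l2_normalize(db[c]["centroid"]) @ q) for c in categories]
    category = categories[int(np.argmax(cat_scores))]

    # Semantic level: nearest cluster centroid inside that category.
    clusters = db[category]["clusters"]
    cl_scores = [float(l2_normalize(cl["centroid"]) @ q) for cl in clusters]
    cluster_id = int(np.argmax(cl_scores))

    # Instance level: top-k most similar stored normal samples.
    instances = l2_normalize(clusters[cluster_id]["instances"])
    sims = instances @ q
    order = np.argsort(-sims)[:top_k]
    return category, cluster_id, order, sims[order]


# Toy database with made-up 3-D embeddings (illustrative values only).
toy_db = {
    "bottle": {
        "centroid": np.array([1.0, 0.0, 0.0]),
        "clusters": [
            {"centroid": np.array([1.0, 0.2, 0.0]),
             "instances": np.array([[1.0, 0.2, 0.0], [1.0, 0.3, 0.0]])},
            {"centroid": np.array([1.0, 0.0, 0.2]),
             "instances": np.array([[1.0, 0.0, 0.2], [1.0, 0.0, 0.3]])},
        ],
    },
    "cable": {
        "centroid": np.array([0.0, 1.0, 0.0]),
        "clusters": [
            {"centroid": np.array([0.0, 1.0, 0.0]),
             "instances": np.array([[0.0, 1.0, 0.0]])},
        ],
    },
}

category, cluster_id, ids, sims = coarse_to_fine_retrieve(
    np.array([1.0, 0.0, 0.29]), toy_db
)
```

Each level narrows the search space before the next, so only a small set of instance embeddings is compared against the query. A real system would presumably replace the linear scans with an indexed vector store.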

📝 Abstract
Unsupervised Anomaly Detection (UAD) aims to identify abnormal regions by establishing correspondences between test images and normal templates. Existing methods primarily rely on image reconstruction or template retrieval but face a fundamental challenge: matching between test images and normal templates inevitably introduces noise due to intra-class variations, imperfect correspondences, and limited templates. Observing that Retrieval-Augmented Generation (RAG) leverages retrieved samples directly in the generation process, we reinterpret UAD through this lens and introduce RAID, a retrieval-augmented UAD framework designed for noise-resilient anomaly detection and localization. Unlike standard RAG, which enriches context or knowledge, we focus on using retrieved normal samples to guide noise suppression in anomaly map generation. RAID retrieves class-, semantic-, and instance-level representations from a hierarchical vector database, forming a coarse-to-fine pipeline. A matching cost volume correlates the input with retrieved exemplars, followed by a guided Mixture-of-Experts (MoE) network that leverages the retrieved samples to adaptively suppress matching noise and produce fine-grained anomaly maps. RAID achieves state-of-the-art performance across full-shot, few-shot, and multi-dataset settings on the MVTec, VisA, MPDD, and BTAD benchmarks. https://github.com/Mingxiu-Cai/RAID
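The matching cost volume can be illustrated with a short NumPy sketch. This is an assumption-laden toy, not the paper's method: it assumes patch features are already extracted, uses cosine similarity as the matching score, and replaces the learned guided MoE network with a plain softmax-weighted aggregation over retrieved exemplars. The names `cost_volume`, `hard_scores`, `soft_scores`, and the temperature `tau` are hypothetical.

```python
import numpy as np


def cost_volume(test_feats, ref_feats, eps=1e-8):
    """Cosine similarity between every test patch and every reference patch.

    test_feats: (P, D) patch features of the test image.
    ref_feats:  (K, Q, D) patch features of K retrieved normal exemplars.
    Returns an array of shape (K, P, Q).
    """
    t = test_feats / (np.linalg.norm(test_feats, axis=-1, keepdims=True) + eps)
    r = ref_feats / (np.linalg.norm(ref_feats, axis=-1, keepdims=True) + eps)
    return np.einsum("pd,kqd->kpq", t, r)


def hard_scores(volume):
    """Anomaly score per patch: 1 minus the best match over all exemplars."""
    return 1.0 - volume.max(axis=(0, 2))


def soft_scores(volume, tau=0.1):
    """Softmax-weighted aggregation over exemplars (a stand-in, NOT the
    paper's guided MoE): exemplars that match a patch well get more weight."""
    per_exemplar = 1.0 - volume.max(axis=2)          # (K, P) matching cost
    logits = -per_exemplar / tau
    logits = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    gate = np.exp(logits)
    gate = gate / gate.sum(axis=0, keepdims=True)    # weights sum to 1 over K
    return (gate * per_exemplar).sum(axis=0)         # (P,)


# Toy input: patch 0 appears in the exemplars, patch 1 matches nothing.
test_feats = np.array([[1.0, 0.0, 0.0],
                       [0.0, 0.0, 1.0]])
ref_feats = np.array([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
                      [[1.0, 1.0, 0.0], [0.0, 1.0, 0.0]]])

volume = cost_volume(test_feats, ref_feats)
hard = hard_scores(volume)
soft = soft_scores(volume)
```

In this toy case the patch that has a close counterpart among the exemplars receives a low score, and the patch with no counterpart receives a high one. The soft variate only hints at why adaptive weighting can help: a poorly aligned exemplar contributes less to the final map instead of adding its matching noise at full strength.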
Problem

Research questions and friction points this paper is trying to address.

Unsupervised Anomaly Detection
template retrieval
matching noise
anomaly localization
intra-class variations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Retrieval-Augmented Anomaly Detection
Unsupervised Anomaly Detection
Mixture-of-Experts
Noise-Resilient Localization
Hierarchical Retrieval
Mingxiu Cai
State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University
Zhe Zhang
State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University
Gaochang Wu
State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University
Tianyou Chai
Northeastern University, China
modeling, control, optimization, integrated automation of industrial processes, adaptive control
Xiatian Zhu
University of Surrey
Machine Learning, Computer Vision