Unified Unsupervised Anomaly Detection via Matching Cost Filtering

📅 2025-10-02

🤖 AI Summary

Unsupervised anomaly detection (UAD) faces three key challenges: scarcity of anomalous samples, matching noise in feature correspondence, and fragmented treatment of unimodal versus multimodal data. This paper proposes Unified Cost Filtering (UCF), a unified, matching-centric framework for UAD that covers RGB, RGB-3D, and RGB-Text settings and is grounded in a learnable matching cost filtering mechanism. Specifically, it constructs an anomaly cost volume by matching a test sample against normal samples from the same or a different modality, then applies a learnable filtering module with multi-layer attention guidance to suppress matching noise and highlight subtle anomalies. UCF is a post-hoc refinement step that can be applied to the anomaly cost volume of an existing UAD model. Across 22 benchmarks, it improves a variety of UAD methods and reports new state-of-the-art results in both unimodal and multimodal UAD.

📝 Abstract

Unsupervised anomaly detection (UAD) aims to identify image- and pixel-level anomalies using only normal training data, with wide applications such as industrial inspection and medical analysis, where anomalies are scarce due to privacy concerns and cold-start constraints. Existing methods, whether reconstruction-based (restoring normal counterparts) or embedding-based (pretrained representations), fundamentally conduct image- or feature-level matching to generate anomaly maps. Nonetheless, matching noise has been largely overlooked, limiting their detection ability. Beyond the earlier focus on unimodal RGB-based UAD, recent advances expand to multimodal scenarios, e.g., RGB-3D and RGB-Text, enabled by point cloud sensing and vision-language models. Despite shared challenges, these lines of work remain largely isolated, hindering comprehensive understanding and knowledge transfer. In this paper, we advocate unified UAD for both unimodal and multimodal settings from a matching perspective. Under this insight, we present Unified Cost Filtering (UCF), a generic post-hoc refinement framework for refining the anomaly cost volume of any UAD model. The cost volume is constructed by matching a test sample against normal samples from the same or different modalities, followed by a learnable filtering module with multi-layer attention guidance from the test sample, mitigating matching noise and highlighting subtle anomalies. Comprehensive experiments on 22 diverse benchmarks demonstrate the efficacy of UCF in enhancing a variety of UAD methods, consistently achieving new state-of-the-art results in both unimodal (RGB) and multimodal (RGB-3D, RGB-Text) UAD scenarios. Code and models will be released at https://github.com/ZHE-SAPI/CostFilter-AD.
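To make the idea of an anomaly cost volume concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions and not the paper's implementation: the cosine-distance cost, the function names (`cosine_similarity`, `build_cost_volume`, `unfiltered_anomaly_scores`), and the toy 2-D patch features are all hypothetical. The learnable, attention-guided filtering module is not modeled; the last function only shows the naive minimum-cost readout that such a filter would refine.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_cost_volume(test_patches, normal_templates):
    """Match every test patch against every patch of every normal template.

    Returns cost[k][i][j] = 1 - cos(test patch i, patch j of template k).
    The cosine-distance cost is an assumption made for this sketch.
    """
    return [
        [[1.0 - cosine_similarity(t, n) for n in template] for t in test_patches]
        for template in normal_templates
    ]


def unfiltered_anomaly_scores(cost_volume):
    """Naive readout: lowest matching cost per test patch over all templates.

    A patch with no close match in any normal template keeps a high score.
    No filtering is applied here, so noisy matches pass through unchanged.
    """
    num_test_patches = len(cost_volume[0])
    return [
        min(min(template_costs[i]) for template_costs in cost_volume)
        for i in range(num_test_patches)
    ]


# Toy data: three test patches and two normal templates with two patches each.
test_patches = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
normal_templates = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.1], [0.1, 1.0]],
]

volume = build_cost_volume(test_patches, normal_templates)
scores = unfiltered_anomaly_scores(volume)
most_anomalous = max(range(len(scores)), key=scores.__getitem__)
print(scores, most_anomalous)
```

In this toy example the third test patch has no close counterpart in either normal template, so it receives the highest score. In practice a spurious low-cost match can hide a true anomaly and a spurious high-cost match can flag a normal region, which is the matching noise the abstract says the filtering module is meant to mitigate.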

Problem

Research questions and friction points this paper is trying to address.

Addressing matching noise in unsupervised anomaly detection methods

Unifying anomaly detection across unimodal and multimodal scenarios

Refining anomaly cost volumes through learnable filtering modules

Innovation

Methods, ideas, or system contributions that make the work stand out.

Unified Cost Filtering framework for anomaly detection

Post-hoc refinement of anomaly cost volume

Multi-layer attention guidance reduces matching noise

🔎 Similar Papers

Anomaly Detection by Context Contrasting