AA-CLIP: Enhancing Zero-shot Anomaly Detection via Anomaly-Aware CLIP

📅 2025-03-09
🤖 AI Summary
CLIP lacks anomaly awareness, so its features for normal and anomalous inputs are poorly separated in zero-shot anomaly detection. AA-CLIP addresses this in both the text and visual spaces with a two-stage approach: (1) it builds anomaly-aware text anchors that clearly separate normal and abnormal semantics; and (2) it aligns patch-level visual features with those anchors for precise anomaly localization. Residual adapters adapt CLIP gradually and in a controlled manner, which preserves its class knowledge and generalization capability. The abstract reports state-of-the-art zero-shot results in industrial and medical applications and describes the method as resource-efficient.

📝 Abstract
Anomaly detection (AD) identifies outliers for applications like defect and lesion detection. While CLIP shows promise for zero-shot AD tasks due to its strong generalization capabilities, its inherent Anomaly-Unawareness leads to limited discrimination between normal and abnormal features. To address this problem, we propose Anomaly-Aware CLIP (AA-CLIP), which enhances CLIP's anomaly discrimination ability in both text and visual spaces while preserving its generalization capability. AA-CLIP is achieved through a straightforward yet effective two-stage approach: it first creates anomaly-aware text anchors to differentiate normal and abnormal semantics clearly, then aligns patch-level visual features with these anchors for precise anomaly localization. This two-stage strategy, with the help of residual adapters, gradually adapts CLIP in a controlled manner, achieving effective AD while maintaining CLIP's class knowledge. Extensive experiments validate AA-CLIP as a resource-efficient solution for zero-shot AD tasks, achieving state-of-the-art results in industrial and medical applications. The code is available at https://github.com/Mwxinnn/AA-CLIP.
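
The abstract mentions residual adapters that adapt CLIP gradually while keeping its existing knowledge. A minimal, dependency-free sketch of that idea follows, assuming the adapter blends a frozen feature with a small trainable transform of itself. The function name `residual_adapter`, the ReLU nonlinearity, and the mixing ratio `alpha` are illustrative assumptions, not details taken from the paper or its repository.

```python
def residual_adapter(feature, weight, alpha=0.1):
    """Blend a frozen feature with a lightly transformed copy of itself.

    feature: list of floats, one token or patch embedding from a frozen layer.
    weight:  square matrix (list of rows) standing in for the trainable adapter.
    alpha:   hypothetical mixing ratio; alpha=0 returns the frozen feature as-is.
    """
    # Trainable branch: a linear map followed by ReLU (an assumed nonlinearity).
    adapted = [
        max(0.0, sum(w * x for w, x in zip(row, feature)))
        for row in weight
    ]
    # Residual blend: most of the original representation is kept when
    # alpha is small, which is what keeps the adaptation controlled.
    return [(1.0 - alpha) * x + alpha * a for x, a in zip(feature, adapted)]


# Toy usage with an identity matrix as the adapter weight.
identity = [[1.0, 0.0], [0.0, 1.0]]
frozen = [1.0, -2.0]
unchanged = residual_adapter(frozen, identity, alpha=0.0)
blended = residual_adapter(frozen, identity, alpha=0.5)
```

With a small `alpha`, the output stays close to the frozen feature, which is one plausible way to read "adapts CLIP in a controlled manner".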
Problem

Research questions and friction points this paper is trying to address.

Enhance CLIP for zero-shot anomaly detection
Improve discrimination between normal and abnormal features
Localize anomalies precisely while preserving CLIP's generalization capability
Innovation

Methods, ideas, or system contributions that make the work stand out.

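Anomaly-Aware CLIP (AA-CLIP) strengthens anomaly discrimination in both the text and visual spaces of CLIP
Two-stage approach: anomaly-aware text anchors first, then visual alignment, with residual adapters adapting CLIP gradually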
Patch-level visual feature alignment for precise localization
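
The patch-level alignment described above can be illustrated with a short sketch: each patch feature is compared with a normal anchor and an abnormal anchor, and the abnormal probability becomes that patch's anomaly score. This is a hedged illustration, not the authors' implementation. The function names, the two-way softmax, the temperature of 0.07, and the use of the maximum patch score as an image-level score are all assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def anomaly_map(patch_features, normal_anchor, abnormal_anchor, temperature=0.07):
    """Return one abnormal probability per patch.

    patch_features: list of patch embeddings (each a list of floats).
    normal_anchor / abnormal_anchor: text embeddings for the two states.
    temperature: assumed softmax temperature (hypothetical default).
    """
    scores = []
    for patch in patch_features:
        s_norm = cosine(patch, normal_anchor) / temperature
        s_abn = cosine(patch, abnormal_anchor) / temperature
        # Two-way softmax, shifted by the max logit for numerical stability.
        shift = max(s_norm, s_abn)
        e_norm = math.exp(s_norm - shift)
        e_abn = math.exp(s_abn - shift)
        scores.append(e_abn / (e_norm + e_abn))
    return scores


# Toy 2-D example: three patches and two orthogonal anchors.
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = anomaly_map(patches, normal_anchor=[1.0, 0.0], abnormal_anchor=[0.0, 1.0])
image_score = max(scores)  # assumed image-level score: the most anomalous patch
```

Reshaping `scores` to the patch grid would give a coarse localization map. In practice, the anchors would come from a CLIP text encoder and the patch features from its visual encoder.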
Authors

Wenxin Ma
University of Science and Technology of China
Interests: AI, computer vision

Xu Zhang
School of Biomedical Engineering, Division of Life Sciences and Medicine, USTC; MIRACLE Center, Suzhou Institute for Advanced Research, USTC

Qingsong Yao
Stanford University | ICT, CAS
Interests: Medical Image Computing, Medical Image Analysis

Fenghe Tang
University of Science and Technology of China
Interests: Medical Image Analysis, Foundation model

Chenxu Wu
USTC
Interests: diffusion-based methods, multimodal learning

Yingtai Li
University of Science & Technology of China

Rui Yan
School of Biomedical Engineering, Division of Life Sciences and Medicine, USTC; MIRACLE Center, Suzhou Institute for Advanced Research, USTC

Zihang Jiang
School of Biomedical Engineering, USTC, Suzhou Institute for Advanced Research
Interests: Computer Vision, Medical Imaging, 3D

S. Kevin Zhou
School of Biomedical Engineering, Division of Life Sciences and Medicine, USTC; MIRACLE Center, Suzhou Institute for Advanced Research, USTC; Key Laboratory of Intelligent Information Processing of CAS, ICT, CAS; State Key Laboratory of Precision and Intelligent Chemistry, USTC