🤖 AI Summary
In zero-shot anomaly detection, CLIP discriminates poorly between normal and anomalous features because it lacks anomaly awareness. To address this, we propose an anomaly-aware dual-space enhancement framework that (1) constructs semantically separable abnormal text anchors to model anomalous semantics explicitly, and (2) uses a lightweight residual adapter for progressive, knowledge-preserving patch-level vision–language alignment. Our method requires no anomalous samples during training and preserves CLIP’s strong generalization capability. Evaluated on industrial defect and medical lesion detection benchmarks, it achieves state-of-the-art zero-shot performance, outperforming prior methods in both classification and localization while keeping computational overhead low. To our knowledge, this is the first systematic, anomaly-aware enhancement framework for adapting CLIP to zero-shot anomaly detection.
📝 Abstract
Anomaly detection (AD) identifies outliers for applications like defect and lesion detection. While CLIP shows promise for zero-shot AD tasks due to its strong generalization capabilities, its inherent Anomaly-Unawareness leads to limited discrimination between normal and abnormal features. To address this problem, we propose Anomaly-Aware CLIP (AA-CLIP), which enhances CLIP's anomaly discrimination ability in both text and visual spaces while preserving its generalization capability. AA-CLIP is achieved through a straightforward yet effective two-stage approach: it first creates anomaly-aware text anchors to differentiate normal and abnormal semantics clearly, then aligns patch-level visual features with these anchors for precise anomaly localization. This two-stage strategy, with the help of residual adapters, gradually adapts CLIP in a controlled manner, achieving effective AD while maintaining CLIP's class knowledge. Extensive experiments validate AA-CLIP as a resource-efficient solution for zero-shot AD tasks, achieving state-of-the-art results in industrial and medical applications. The code is available at https://github.com/Mwxinnn/AA-CLIP.
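The two ideas named in the abstract, residual adaptation and patch-to-anchor alignment, can be illustrated with a minimal dependency-free sketch. It is not the authors' implementation (see the linked repository for that). The function names, the blend weight `alpha`, the `temperature`, and the softmax-over-two-anchors scoring rule are all assumptions made for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def residual_adapter(feature, adapted, alpha=0.1):
    """Blend a frozen feature with an adapter output.

    Assumed form: (1 - alpha) * feature + alpha * adapted. A small alpha keeps
    the result close to the original CLIP feature, which is one way to adapt
    "in a controlled manner"; the exact rule used by AA-CLIP may differ.
    """
    return [(1 - alpha) * f + alpha * a for f, a in zip(feature, adapted)]


def anomaly_map(patch_features, normal_anchor, abnormal_anchor, temperature=0.07):
    """Return one abnormality probability per patch.

    Each patch is compared to a "normal" and an "abnormal" text anchor by
    cosine similarity; a two-way softmax turns the pair into a probability.
    """
    n = l2_normalize(normal_anchor)
    a = l2_normalize(abnormal_anchor)
    scores = []
    for patch in patch_features:
        p = l2_normalize(patch)
        s_n = sum(x * y for x, y in zip(p, n)) / temperature
        s_a = sum(x * y for x, y in zip(p, a)) / temperature
        m = max(s_n, s_a)  # subtract the max for numerical stability
        e_n, e_a = math.exp(s_n - m), math.exp(s_a - m)
        scores.append(e_a / (e_n + e_a))
    return scores


if __name__ == "__main__":
    # Toy 2-D embeddings standing in for CLIP patch and text features.
    patches = [[1.0, 0.0], [0.0, 1.0]]
    print(anomaly_map(patches, normal_anchor=[1.0, 0.0], abnormal_anchor=[0.0, 1.0]))
```

In this toy setup, a patch aligned with the normal anchor scores near 0 and a patch aligned with the abnormal anchor scores near 1. Reshaping the per-patch scores to the patch grid would give a coarse localization map.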