AI Summary
This work proposes an approach to anomaly localization in medical images that requires neither pixel-level annotations nor fine-tuning of the pretrained model. Using frozen DINO-V3 self-supervised features, the method selects a semantically aligned support image through embedding similarity matching and models the distribution of normal features with foreground-aware K-means clustering. Anomaly maps are then computed from the cosine similarity between query features and the clustered normal embeddings. On the Brain and Liver datasets, the approach reaches an AUROC of up to 98.71, outperforming existing methods and producing clearer, more accurate localization of anomalous regions.
Abstract
Unsupervised anomaly detection (AD) in medical images aims to identify abnormal regions without relying on pixel-level annotations, which is crucial for scalable and label-efficient diagnostic systems. In this paper, we propose a novel anomaly detection framework based on DINO-V3 representations, termed DINO-AD, which leverages self-supervised visual features for precise and interpretable anomaly localization. Specifically, we introduce an embedding similarity matching strategy to select a semantically aligned support image and a foreground-aware K-means clustering module to model the distribution of normal features. Anomaly maps are then computed by comparing the query features with the clustered normal embeddings through cosine similarity. Experimental results on both the Brain and Liver datasets demonstrate that our method outperforms state-of-the-art approaches quantitatively, with AUROC scores of up to 98.71. Qualitative results further confirm that our framework produces clearer and more accurate anomaly localization. Extensive ablation studies validate the effectiveness of each proposed component, highlighting the robustness and generalizability of our approach.
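The three steps named in the abstract (support selection, foreground-aware clustering, cosine-similarity scoring) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes patch and global features have already been extracted by a frozen backbone and are passed in as arrays, and the function names, the boolean foreground mask, the plain K-means loop, and the choice of `k` are all hypothetical stand-ins for details the abstract does not specify.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    # Scale vectors along the last axis to unit length.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def select_support(query_global, support_globals):
    # Embedding similarity matching (assumed form): pick the normal image
    # whose global embedding has the highest cosine similarity to the query.
    sims = l2_normalize(support_globals) @ l2_normalize(query_global)
    return int(np.argmax(sims))

def kmeans(x, k, iters=20, seed=0):
    # Plain Lloyd's K-means; returns the k cluster centers.
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return centers

def anomaly_map(query_patches, support_patches, support_fg_mask, grid_hw, k=4):
    # Foreground-aware clustering (assumed form): only support patches
    # flagged as foreground contribute to the model of normal features.
    fg = l2_normalize(support_patches[support_fg_mask])
    centers = l2_normalize(kmeans(fg, k))
    # Cosine similarity of every query patch to every normal cluster center.
    sims = l2_normalize(query_patches) @ centers.T
    # A patch is anomalous when even its best-matching center is dissimilar.
    scores = 1.0 - sims.max(axis=1)
    return scores.reshape(grid_hw)
```

In this sketch, `query_patches` and `support_patches` have shape `(num_patches, dim)`, `support_fg_mask` is a boolean vector of length `num_patches`, and `grid_hw` is the patch grid size, so the returned map would still need upsampling to image resolution before it is compared with a pixel-level mask.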