A Single Image Is All You Need: Zero-Shot Anomaly Localization Without Training Data

📅 2025-09-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
We address zero-shot image anomaly localization: localizing anomalous regions in a test image without access to any normal training samples. To this end, we propose the Single Shot Decomposition Network (SSDnet), the first method to incorporate the deep image prior into zero-shot anomaly detection. SSDnet leverages self-supervised reconstruction of a single test image to model the intrinsic structural prior of normal appearance. To prevent trivial identity mapping, it employs block masking, spatial shuffling, and Gaussian noise perturbations. Furthermore, we introduce a perceptual loss based on inner-product similarity to enhance structural awareness. Evaluated on MVTec-AD and a fabric dataset, SSDnet achieves 0.99/0.98 AUROC and 0.60/0.67 AUPRC, respectively, substantially surpassing state-of-the-art methods. Crucially, SSDnet requires no external data or normal exemplars, enabling truly data-free, single-image anomaly localization with high precision.

📝 Abstract
Anomaly detection in images is typically addressed by learning from collections of training data or by relying on reference samples. In many real-world scenarios, however, such training data may be unavailable, and only the test image itself is provided. We address this zero-shot setting with a single-image anomaly localization method, the Single Shot Decomposition Network (SSDnet), which leverages the inductive bias of convolutional neural networks, inspired by Deep Image Prior (DIP). Our key assumption is that natural images often exhibit unified textures and patterns, and that anomalies manifest as localized deviations from these repetitive or stochastic patterns. To learn the deep image prior, we design a patch-based training framework in which the input image is fed directly into the network for self-reconstruction, rather than mapping random noise to the image as done in DIP. To prevent the model from simply learning an identity mapping, we apply masking, patch shuffling, and small Gaussian noise. In addition, we use a perceptual loss based on inner-product similarity to capture structure beyond pixel fidelity. Our approach needs no external training data, labels, or references, and remains robust in the presence of noise or missing pixels. SSDnet achieves 0.99 AUROC and 0.60 AUPRC on MVTec-AD and 0.98 AUROC and 0.67 AUPRC on the fabric dataset, outperforming state-of-the-art methods. The implementation code will be released at https://github.com/mehrdadmoradi124/SSDnet.
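The abstract's perceptual loss based on inner-product similarity can be sketched roughly as follows. This is a minimal NumPy illustration under our own assumptions (a cosine-normalized inner product over feature maps, with loss defined as one minus the similarity); it is not the authors' released implementation, and the function names are hypothetical:

```python
import numpy as np

def inner_product_similarity(f_x, f_y, eps=1e-8):
    """Cosine-style inner-product similarity between two feature maps.

    f_x, f_y: arrays of shape (C, H, W), e.g. activations from a conv layer.
    Returns a scalar in [-1, 1]; 1 means the two maps are structurally identical.
    """
    x = f_x.ravel()
    y = f_y.ravel()
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y) + eps))

def perceptual_loss(f_recon, f_target):
    """Structural loss term: near 0 when the reconstruction's features
    align with the target's, larger as the structure diverges."""
    return 1.0 - inner_product_similarity(f_recon, f_target)
```

Because the similarity is normalized, the loss is insensitive to overall feature magnitude and penalizes structural (directional) mismatch, which is the stated motivation for going beyond pixel fidelity.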
Problem

Research questions and friction points this paper is trying to address.

Detecting anomalies in images without training data or reference samples
Localizing anomalies using only a single test image itself
Leveraging natural image patterns to identify deviations as anomalies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Single-image self-reconstruction using patch-based training
Masking and shuffling to prevent identity mapping
Perceptual loss based on inner-product similarity
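The anti-identity perturbations listed above (masking, patch shuffling, small Gaussian noise) could look roughly like this in NumPy. The block size, patch size, and noise level here are illustrative guesses, not the paper's actual hyperparameters:

```python
import numpy as np

def perturb(img, rng, mask_block=8, shuffle_patch=16, noise_std=0.02):
    """Corrupt a single image so a reconstruction network cannot
    learn a trivial identity mapping.

    img: float array of shape (H, W) with values in [0, 1].
    Applies (1) block masking, (2) patch shuffling, (3) Gaussian noise.
    """
    out = img.copy()
    h, w = out.shape

    # 1) Block masking: zero out one randomly placed square block.
    y = rng.integers(0, h - mask_block)
    x = rng.integers(0, w - mask_block)
    out[y:y + mask_block, x:x + mask_block] = 0.0

    # 2) Patch shuffling: randomly permute non-overlapping grid patches.
    gh, gw = h // shuffle_patch, w // shuffle_patch
    patches = [out[i * shuffle_patch:(i + 1) * shuffle_patch,
                   j * shuffle_patch:(j + 1) * shuffle_patch].copy()
               for i in range(gh) for j in range(gw)]
    order = rng.permutation(len(patches))
    for k, idx in enumerate(order):
        i, j = divmod(k, gw)
        out[i * shuffle_patch:(i + 1) * shuffle_patch,
            j * shuffle_patch:(j + 1) * shuffle_patch] = patches[idx]

    # 3) Additive Gaussian noise, clipped back to the valid range.
    out += rng.normal(0.0, noise_std, size=out.shape)
    return np.clip(out, 0.0, 1.0)
```

Training would then reconstruct the clean image from `perturb(img, rng)`, so the network must learn the image's repetitive structure rather than copying pixels through.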
Mehrdad Moradi
Georgia Tech
Shengzhe Chen
Arizona State University
Hao Yan
Arizona State University
Kamran Paynabar
Unknown affiliation