🤖 AI Summary
We address zero-shot image anomaly localization—localizing anomalous regions in a test image without access to any normal training samples. To this end, we propose the Single Shot Decomposition Network (SSDnet), the first method to incorporate the deep image prior into zero-shot anomaly detection. SSDnet leverages self-supervised reconstruction from a single test image to model the intrinsic structural prior of normal appearance. To prevent a trivial identity mapping, it employs block masking, spatial shuffling, and Gaussian noise perturbations. Furthermore, we introduce a perceptual loss based on inner-product similarity to enhance structural awareness. Evaluated on MVTec-AD and a fabric dataset, SSDnet achieves 0.99/0.98 AUROC and 0.60/0.67 AUPRC, respectively—substantially surpassing state-of-the-art methods. Crucially, SSDnet requires no external data or normal exemplars, enabling truly data-free, single-image-driven anomaly localization with high precision.
📝 Abstract
Anomaly detection in images is typically addressed by learning from collections of training data or by relying on reference samples. In many real-world scenarios, however, such training data may be unavailable, and only the test image itself is provided. We address this zero-shot setting with the Single Shot Decomposition Network (SSDnet), a single-image anomaly localization method that leverages the inductive bias of convolutional neural networks, inspired by Deep Image Prior (DIP). Our key assumption is that natural images often exhibit unified textures and patterns, and that anomalies manifest as localized deviations from these repetitive or stochastic patterns. To learn the deep image prior, we design a patch-based training framework in which the input image is fed directly into the network for self-reconstruction, rather than mapping random noise to the image as done in DIP. To prevent the model from simply learning an identity mapping, we apply masking, patch shuffling, and small Gaussian noise. In addition, we use a perceptual loss based on inner-product similarity to capture structure beyond pixel fidelity. Our approach needs no external training data, labels, or references, and remains robust in the presence of noise or missing pixels. SSDnet achieves 0.99 AUROC and 0.60 AUPRC on MVTec-AD and 0.98 AUROC and 0.67 AUPRC on the fabric dataset, outperforming state-of-the-art methods. The implementation code will be released at https://github.com/mehrdadmoradi124/SSDnet.
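The two anti-identity ingredients described above—corrupting the input (masking, patch shuffling, small Gaussian noise) and scoring the reconstruction with an inner-product perceptual loss—can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, block size, and corruption fractions are hypothetical placeholders, and the loss shown is a standard normalized inner-product (cosine) dissimilarity on feature vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

def corrupt(image, block=8, mask_frac=0.25, shuffle_frac=0.25, noise_std=0.02):
    """Corrupt a single image so a reconstruction network cannot learn
    a trivial identity map. All parameters here are illustrative."""
    h, w = image.shape[:2]
    out = image.astype(float).copy()
    coords = [(y, x) for y in range(h // block) for x in range(w // block)]
    # Block masking: zero out a random subset of blocks.
    for y, x in rng.permutation(coords)[: int(mask_frac * len(coords))]:
        out[y*block:(y+1)*block, x*block:(x+1)*block] = 0.0
    # Patch shuffling: swap random pairs of blocks.
    picks = rng.permutation(coords)[: 2 * int(shuffle_frac * len(coords) / 2)]
    for (y1, x1), (y2, x2) in zip(picks[0::2], picks[1::2]):
        tmp = out[y1*block:(y1+1)*block, x1*block:(x1+1)*block].copy()
        out[y1*block:(y1+1)*block, x1*block:(x1+1)*block] = \
            out[y2*block:(y2+1)*block, x2*block:(x2+1)*block]
        out[y2*block:(y2+1)*block, x2*block:(x2+1)*block] = tmp
    # Small additive Gaussian noise.
    return out + rng.normal(0.0, noise_std, out.shape)

def inner_product_loss(feat_a, feat_b, eps=1e-8):
    """Perceptual loss from normalized inner-product similarity between
    per-sample feature vectors; approaches 0 when features align."""
    a = feat_a.reshape(feat_a.shape[0], -1)
    b = feat_b.reshape(feat_b.shape[0], -1)
    sim = (a * b).sum(axis=1) / (
        np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1) + eps)
    return float((1.0 - sim).mean())
```

In a full pipeline, the corrupted image would be passed through the reconstruction network, features of the output and the clean input would be compared with the inner-product loss, and the per-pixel reconstruction error would then serve as the anomaly map.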