🤖 AI Summary
Industrial anomaly detection suffers from a severe scarcity of real anomaly samples, which limits performance in both localization and classification. To address this, we propose Generate Aligned Anomaly (GAA), a region-guided, few-shot framework for generating anomaly image-mask pairs, built on a pretrained latent diffusion model. The method combines Localized Concept Decomposition with Adaptive Multi-Round Anomaly Clustering to control the type and location of generated anomalies while improving semantic consistency. A region-guided mask generation mechanism aligns each synthesized anomaly with its mask at the pixel level, and a low-quality sample filtering strategy improves synthesis reliability. Extensive experiments on MVTec AD and LOCO show that the generated anomalies are realistic and precisely localized, and that GAA consistently outperforms state-of-the-art methods on downstream anomaly localization and classification.
📝 Abstract
Anomaly inspection plays a vital role in industrial manufacturing, but the scarcity of anomaly samples significantly limits the effectiveness of existing methods in tasks such as localization and classification. Several anomaly synthesis approaches have been introduced for data augmentation, but they often suffer from low realism, inaccurate mask alignment, and poor generalization. To overcome these limitations, we propose Generate Aligned Anomaly (GAA), a region-guided, few-shot framework for generating anomaly image-mask pairs. GAA leverages the strong priors of a pretrained latent diffusion model to generate realistic, diverse, and semantically aligned anomalies from only a small number of samples. The framework first employs Localized Concept Decomposition to jointly model the semantic features and spatial information of anomalies, enabling flexible control over their type and location. It then uses Adaptive Multi-Round Anomaly Clustering to perform fine-grained semantic clustering of anomaly concepts, which makes anomaly representations more consistent. Finally, a region-guided mask generation strategy ensures precise alignment between anomalies and their corresponding masks, and a low-quality sample filtering module further improves the overall quality of the generated samples. Extensive experiments on the MVTec AD and LOCO datasets demonstrate that GAA achieves superior performance in both anomaly synthesis quality and downstream tasks such as localization and classification.
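The abstract does not state which criterion the low-quality sample filtering module uses. As a purely illustrative sketch, and not GAA's actual method, one plausible check is to keep a generated image-mask pair only when the region that visibly differs from a normal reference image overlaps well with the generated mask. The function names, the intensity tolerance `tol`, and the threshold `min_iou` below are all hypothetical.

```python
def iou(mask_a, mask_b):
    """Intersection over union of two same-shaped binary masks (rows of 0/1)."""
    inter = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    # Two empty masks are treated as perfectly aligned (assumption).
    return inter / union if union else 1.0


def changed_region(normal, generated, tol=0.1):
    """Binary map of pixels whose intensity differs from the normal image by more than tol."""
    return [
        [1 if abs(g - n) > tol else 0 for n, g in zip(row_n, row_g)]
        for row_n, row_g in zip(normal, generated)
    ]


def keep_pair(normal, generated, mask, min_iou=0.5):
    """Hypothetical filter: keep the pair if the mask matches the changed region."""
    return iou(changed_region(normal, generated), mask) >= min_iou


# Toy 3x3 grayscale example: the anomaly occupies the first two pixels of row 0.
normal = [[0.0, 0.0, 0.0] for _ in range(3)]
generated = [[1.0, 1.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
aligned_mask = [[1, 1, 0], [0, 0, 0], [0, 0, 0]]
misaligned_mask = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]

kept = keep_pair(normal, generated, aligned_mask)
dropped = not keep_pair(normal, generated, misaligned_mask)
```

A pixel-difference map is only a crude proxy for where an anomaly was synthesized; a real pipeline would more likely score pairs in a learned feature space, which the abstract leaves unspecified.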