🤖 AI Summary
This work addresses artifact detection in histopathology images, where artifacts introduced during slide preparation and digitization, including previously unseen types, can severely compromise diagnostic reliability. Existing methods typically require extensive pixel-level annotations and generalize poorly. To overcome these limitations, the study proposes the first unsupervised approach that uses a diffusion model to detect artifacts in pathology images, treating them as outliers that deviate from the distribution of clean images. The model is trained exclusively on artifact-free images, so it needs neither artifact annotations nor prior assumptions about artifact types. An added contrastive learning module widens the separation between the distributions of clean and artifact-affected images. Extensive experiments show that the proposed approach significantly outperforms state-of-the-art methods across multiple datasets, substantially reducing reliance on labeled data while generalizing well across stains.
📝 Abstract
Digital pathology plays a vital role across modern medicine, offering critical insights for disease diagnosis, prognosis, and treatment. However, histopathology images often contain artifacts introduced during slide preparation and digitization, and detecting and excluding them is essential for reliable downstream analysis. Traditional supervised models typically require large annotated datasets, which are resource-intensive to produce, and they do not generalize to novel artifact types. To address this, we propose DiffusionQC, which uses a diffusion model to detect artifacts as outliers relative to clean images. It requires only a set of clean images for training, with no need for pixel-level artifact annotations or predefined artifact types. Furthermore, we introduce a contrastive learning module that explicitly enlarges the separation between the distributions of artifact and clean images, yielding an enhanced version of our method. Empirical results demonstrate performance superior to state-of-the-art methods, along with cross-stain generalization, while using significantly less data and fewer annotations.
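The idea of scoring artifacts as outliers under a model of clean images can be illustrated with a toy sketch. This is not the DiffusionQC implementation, whose scoring rule is not described here. It assumes that reconstruction error after noising and denoising serves as the outlier score, and it replaces the trained diffusion network with a closed-form stand-in: the optimal denoiser when clean pixels are independent Gaussians. The function names (`forward_noise`, `make_gaussian_denoiser`, `anomaly_map`) and all parameter values are hypothetical.

```python
import numpy as np


def forward_noise(x0, alpha_bar, rng):
    """DDPM-style forward step: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps


def make_gaussian_denoiser(mu, sigma):
    """Stand-in for a trained network (assumption, not the paper's model).

    If clean pixels follow N(mu, sigma^2), the posterior mean E[x0 | x_t]
    has the closed form used below.
    """
    def denoise(x_t, alpha_bar):
        gain = np.sqrt(alpha_bar) * sigma**2 / (alpha_bar * sigma**2 + 1.0 - alpha_bar)
        return mu + gain * (x_t - np.sqrt(alpha_bar) * mu)

    return denoise


def anomaly_map(x0, denoise_fn, alpha_bar=0.5, n_samples=8, seed=0):
    """Per-pixel squared reconstruction error, averaged over noise draws."""
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(n_samples):
        x_t = forward_noise(x0, alpha_bar, rng)
        x_hat = denoise_fn(x_t, alpha_bar)  # pulled toward the clean distribution
        errors.append((x_hat - x0) ** 2)
    return np.mean(errors, axis=0)


# Toy "slide": clean tissue near intensity 0.7 with a dark artifact patch.
rng = np.random.default_rng(1)
image = 0.7 + 0.05 * rng.standard_normal((32, 32))
image[8:16, 8:16] = 0.1  # synthetic artifact, never seen by the "model"

score = anomaly_map(image, make_gaussian_denoiser(mu=0.7, sigma=0.05))
artifact = np.zeros_like(image, dtype=bool)
artifact[8:16, 8:16] = True
print("mean score inside artifact:", score[artifact].mean())
print("mean score on clean pixels:", score[~artifact].mean())
```

Because the stand-in denoiser only knows the clean distribution, it reconstructs clean pixels closely and pulls artifact pixels back toward clean values, so the error concentrates on the artifact patch. Thresholding `score` would then give a binary artifact mask; how that threshold is chosen in practice is outside what the abstract states.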