🤖 AI Summary
Medical image anomaly detection (AD) has long lacked fair and comprehensive evaluation benchmarks, hindering reproducibility, comparability, and progress. To address this, we introduce a unified benchmark for medical AD encompassing seven diverse datasets, five imaging modalities, and thirty state-of-the-art methods, systematically evaluated on both image-level anomaly classification and pixel-level anomaly segmentation. For the first time, we conduct a component-level analysis of reconstruction-based models (e.g., VAEs, GANs), self-supervised learning approaches (e.g., DINO, MAE), and emerging vision representation methods. Our analysis reveals critical bottlenecks: poor cross-modal generalization, difficulty in localizing anomalies under limited supervision, and suboptimal fine-grained segmentation on histopathological slides. We publicly release all data, code, and a standardized evaluation protocol, establishing strong reference baselines. This benchmark enhances reproducibility, fairness, and rigor, providing a foundational resource for future AD research.
📝 Abstract
Anomaly detection (AD) aims to detect abnormal samples that deviate from expected normal patterns. Because AD models can typically be trained on normal data alone, without requiring abnormal samples, they play an important role in rare-disease recognition and health screening in the medical domain. Despite the emergence of numerous methods for medical AD, we observe the lack of a fair and comprehensive evaluation, which leads to ambiguous conclusions and hinders the development of this field. To address this problem, this paper builds a benchmark for unified comparison. Seven medical datasets spanning five image modalities, including chest X-rays, brain MRIs, retinal fundus images, dermatoscopic images, and histopathology whole slide images, are curated for extensive evaluation. Thirty typical AD methods, including reconstruction-based and self-supervised learning-based methods, are compared on image-level anomaly classification and pixel-level anomaly segmentation. Furthermore, for the first time, we formally explore the effect of key components in existing methods, clearly revealing unresolved challenges and potential future directions. The datasets and code are available at https://github.com/caiyu6666/MedIAnomaly.
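The reconstruction paradigm evaluated in the benchmark can be illustrated with a minimal sketch: fit a model of "normal" appearance on normal data only, then score test samples by how poorly they are reconstructed. The sketch below is an assumption-laden toy (PCA on flattened synthetic "images" stands in for a deep autoencoder; all data and names are made up), but it shows both outputs the abstract distinguishes: an image-level anomaly score and a pixel-level anomaly map.

```python
import numpy as np

def fit_pca(normal: np.ndarray, k: int):
    """Learn a low-rank 'normal' subspace from flattened normal images."""
    mean = normal.mean(axis=0)
    # SVD of the centered data; the top-k right singular vectors
    # span the directions of normal variation.
    _, _, vt = np.linalg.svd(normal - mean, full_matrices=False)
    return mean, vt[:k]

def anomaly_scores(images: np.ndarray, mean: np.ndarray, comps: np.ndarray):
    """Reconstruct from the normal subspace; score by reconstruction error."""
    recon = (images - mean) @ comps.T @ comps + mean
    pixel_err = (images - recon) ** 2          # pixel-level anomaly map
    return pixel_err.mean(axis=1), pixel_err   # image-level score, map

# Synthetic data: "normal" images are a smooth ramp plus noise.
rng = np.random.default_rng(0)
ramp = np.linspace(0.0, 1.0, 64)
normal = ramp + rng.normal(0.0, 0.1, size=(200, 64))
test_normal = ramp + rng.normal(0.0, 0.1, size=(5, 64))
test_anom = test_normal.copy()
test_anom[:, 20:30] += 2.0  # inject a localized "lesion"

mean, comps = fit_pca(normal, k=5)
s_norm, _ = anomaly_scores(test_normal, mean, comps)
s_anom, pmap = anomaly_scores(test_anom, mean, comps)
```

Anomalous inputs fall outside the learned subspace, so their image-level scores exceed those of held-out normal samples, and the per-pixel error map peaks over the injected region, mirroring the classification and segmentation tasks compared in the benchmark.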