🤖 AI Summary
This work addresses anomaly detection in medical images, where anomalous samples are typically unavailable during training. Instead of pairing a single pretext task with a large-scale pre-trained model, the authors train a joint Mixture-of-Experts (MoE) model from scratch on multiple self-supervised and pseudo-labeling proxy tasks. At inference, anomaly scores are derived from how well the multi-task learner solves each task. On BMAD, a benchmark covering a broad range of medical image modalities, the authors report that the method outperforms all state-of-the-art competitors. The model also produces interpretable anomaly maps that could help physicians reach more accurate diagnoses.
📝 Abstract
Anomaly detection in medical images is a challenging task, since anomalies are not typically available during training. Recent methods leverage a single pretext task coupled with a large-scale pre-trained model to reach state-of-the-art performance. Instead, we propose to learn multiple self-supervised and pseudo-labeling tasks from scratch, using a joint model based on Mixture-of-Experts (MoE). By carefully integrating multiple proxy tasks, the joint model effectively learns a robust representation of normal anatomical structures, so that anomaly scores can be derived based on how well the multi-task learner (MTL) solves each task during inference. We perform comprehensive experiments on BMAD, a recent benchmark that comprises a broad range of medical image modalities. The empirical results indicate that our multi-task learner is an effective anomaly detector, outperforming all state-of-the-art competitors on BMAD. Moreover, our model produces interpretable anomaly maps, potentially helping physicians in providing more accurate diagnoses.
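The idea of deriving an anomaly score from how well each task is solved can be illustrated with a minimal sketch. The abstract does not specify how per-task results are combined, so everything below is an assumption made for illustration: the function names `fit_normal_stats` and `anomaly_score` are hypothetical, each task is assumed to yield a scalar error per image, and the aggregation (z-scoring each task error against held-out normal images, then averaging) is a generic choice rather than the authors' method.

```python
import numpy as np

def fit_normal_stats(val_errors):
    """Estimate per-task error statistics from normal images only.

    val_errors: array-like of shape (n_normal_images, n_tasks), where each
    entry is the error of one proxy task on one held-out normal image.
    (Hypothetical helper; not taken from the paper.)
    """
    val_errors = np.asarray(val_errors, dtype=float)
    mean = val_errors.mean(axis=0)
    std = val_errors.std(axis=0) + 1e-8  # avoid division by zero
    return mean, std

def anomaly_score(task_errors, mean, std):
    """Combine per-task errors for one test image into a single score.

    Each task error is standardized against the normal statistics, and the
    standardized errors are averaged. A larger score means the proxy tasks
    were solved worse than is typical for normal images.
    (Assumed aggregation rule, chosen only for illustration.)
    """
    z = (np.asarray(task_errors, dtype=float) - mean) / std
    return float(z.mean())

# Toy usage with two proxy tasks and made-up error values.
normal_val = [[0.10, 0.20], [0.12, 0.22], [0.08, 0.18]]
mean, std = fit_normal_stats(normal_val)
typical = anomaly_score([0.10, 0.20], mean, std)
suspicious = anomaly_score([0.50, 0.90], mean, std)
print(typical < suspicious)
```

Standardizing per task keeps one task with a naturally larger error scale from dominating the combined score; other aggregation rules (for example, a maximum over tasks or learned weights) would fit the same description.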