🤖 AI Summary
To address poor generalization and weak robustness in lesion detection for medical imaging, particularly in dense-breast mammography, this paper proposes Exemplar Med-DETR, a multi-modal contrastive detection framework built on class-specific exemplar features. Methodologically, it uses cross-attention with inherently derived class-specific exemplar features to model the complex relationship between anatomical structures and abnormalities, and it is trained with an iterative strategy. The framework covers three imaging modalities: mammography, chest X-ray, and angiography. It achieves state-of-the-art performance on four public datasets. On Vietnamese dense-breast mammograms, it attains an mAP of 0.70 for mass detection and 0.55 for calcification detection, an absolute improvement of 16 percentage points. A radiologist-supported evaluation of 100 mammograms from an out-of-distribution Chinese cohort shows a twofold gain in lesion detection performance. For chest X-rays and angiography, it reaches an mAP of 0.25 for mass detection and 0.37 for stenosis detection, gains of 4 and 7 percentage points, respectively.
📝 Abstract
Detecting abnormalities in medical images poses unique challenges due to differences in feature representations and the intricate relationship between anatomical structures and abnormalities. This is especially evident in mammography, where dense breast tissue can obscure lesions, complicating radiological interpretation. Despite leveraging anatomical and semantic context, existing detection methods struggle to learn effective class-specific features, limiting their applicability across different tasks and imaging modalities. In this work, we introduce Exemplar Med-DETR, a novel multi-modal contrastive detector that enables feature-based detection. It employs cross-attention with inherently derived, intuitive class-specific exemplar features and is trained with an iterative strategy. We achieve state-of-the-art performance across three distinct imaging modalities from four public datasets. On Vietnamese dense breast mammograms, we attain an mAP of 0.7 for mass detection and 0.55 for calcifications, yielding an absolute improvement of 16 percentage points. Additionally, a radiologist-supported evaluation of 100 mammograms from an out-of-distribution Chinese cohort demonstrates a twofold gain in lesion detection performance. For chest X-rays and angiography, we achieve an mAP of 0.25 for mass and 0.37 for stenosis detection, improving results by 4 and 7 percentage points, respectively. These results highlight the potential of our approach to advance robust and generalizable detection systems for medical imaging.
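The abstract does not spell out how cross-attention with class-specific exemplar features is computed. As a rough illustration only, the sketch below shows generic scaled dot-product cross-attention in which image-token features act as queries and one exemplar vector per class acts as both key and value. Everything here is an assumption made for illustration: the function names, the toy 4-dimensional features, the two class labels, and the omission of learned projections, multiple heads, and the contrastive and iterative training components. It is not the Exemplar Med-DETR implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, exemplars):
    """Scaled dot-product cross-attention with exemplars as keys and values.

    queries:   list of d-dimensional image-token features (hypothetical).
    exemplars: list of d-dimensional class exemplar features (hypothetical).
    Returns (updated_queries, attention_weights).
    """
    dim = len(exemplars[0])
    scale = 1.0 / math.sqrt(dim)
    updated, weights = [], []
    for q in queries:
        # Similarity of this query to every class exemplar.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in exemplars]
        attn = softmax(scores)
        # Attention-weighted mixture of the exemplar vectors.
        mixed = [sum(a * k[j] for a, k in zip(attn, exemplars)) for j in range(dim)]
        updated.append(mixed)
        weights.append(attn)
    return updated, weights


# Toy exemplars for two assumed classes, e.g. "mass" and "calcification".
exemplars = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
# Toy image-token features, each aligned with one of the exemplars.
queries = [[4.0, 0.0, 0.0, 0.0], [0.0, 4.0, 0.0, 0.0]]

updated, weights = cross_attend(queries, exemplars)
```

In this toy setup each query places most of its attention on the exemplar it is aligned with, which is the intuition behind using class exemplars to steer detection queries toward class-specific evidence. A real detector would apply learned query, key, and value projections and would derive the exemplars from data rather than fixing them by hand.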