🤖 AI Summary
X-ray images suffer from feature entanglement due to depth superimposition and object semi-transparency, severely hindering detection of small prohibited items. Existing DETR-based methods struggle with uneven content query distribution and inadequate representation of overlapping object hypotheses. To address these challenges, we propose a novel multi-class min-margin contrastive learning (MMCL) framework comprising (i) a multi-class exclusion loss to enhance inter-class separability; (ii) a min-margin clustering loss to encourage intra-class diversity; and (iii) a self-supervised query distribution rectification mechanism. Evaluated on three major X-ray benchmarks (PIXray, OPIXray, and PIDray), MMCL achieves state-of-the-art performance, significantly improving detection accuracy in overlapping and semi-transparent scenarios, particularly boosting recall for small objects.
📝 Abstract
Unlike natural images with occlusion-based overlap, X-ray images exhibit depth-induced superimposition and semi-transparent appearances, where objects at different depths overlap and their features blend together. These characteristics demand specialized mechanisms to disentangle mixed representations between target objects (e.g., prohibited items) and irrelevant backgrounds. While recent studies have explored adapting detection transformers (DETR) for anti-overlapping object detection, the importance of well-distributed content queries that represent object hypotheses remains underexplored. In this paper, we introduce a multi-class min-margin contrastive learning (MMCL) framework to correct the distribution of content queries, achieving balanced intra-class diversity and inter-class separability. The framework first groups content queries by object category and then applies two proposed complementary loss components: a multi-class exclusion loss to enhance inter-class separability, and a min-margin clustering loss to encourage intra-class diversity. We evaluate the proposed method on three widely used X-ray prohibited-item detection datasets, PIXray, OPIXray, and PIDray, using two backbone networks and four DETR variants. Experimental results demonstrate that MMCL effectively enhances anti-overlapping object detection and achieves state-of-the-art performance on all three datasets. Code will be made publicly available on GitHub.
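The abstract does not give the loss formulas, so the following is only an illustrative sketch of the grouping-plus-two-losses idea, not the authors' implementation. It assumes content queries are plain embedding vectors with a known category label, uses cosine similarity as the comparison measure, and invents the function names, the hinge forms, and the `margin` and `weight` parameters for illustration. The exclusion term penalizes positive similarity between queries of different classes; the clustering term pulls same-class queries together only until their cosine distance falls below `margin`, so they are not forced to collapse onto one point.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def group_by_class(queries, labels):
    """Group query vectors by their category label."""
    groups = {}
    for query, label in zip(queries, labels):
        groups.setdefault(label, []).append(query)
    return groups


def exclusion_loss(groups):
    """Hypothetical inter-class term: mean positive similarity across classes."""
    penalties = []
    for (_, queries_a), (_, queries_b) in combinations(groups.items(), 2):
        for u in queries_a:
            for v in queries_b:
                penalties.append(max(0.0, cosine(u, v)))
    return sum(penalties) / len(penalties) if penalties else 0.0


def min_margin_clustering_loss(groups, margin=0.2):
    """Hypothetical intra-class term: penalize only distance beyond the margin."""
    penalties = []
    for members in groups.values():
        for u, v in combinations(members, 2):
            distance = 1.0 - cosine(u, v)
            # Pairs already within the margin contribute nothing,
            # which leaves room for intra-class diversity.
            penalties.append(max(0.0, distance - margin))
    return sum(penalties) / len(penalties) if penalties else 0.0


def mmcl_loss(queries, labels, margin=0.2, weight=1.0):
    """Sum of the two assumed terms over queries grouped by class."""
    groups = group_by_class(queries, labels)
    return exclusion_loss(groups) + weight * min_margin_clustering_loss(groups, margin)
```

In this sketch, two orthogonal queries from different classes incur no penalty, while two identical queries from different classes incur the maximum exclusion penalty. In a real DETR variant the same terms would be computed on batched tensors and added to the detection loss with a tuned weight.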