🤖 AI Summary
Large-scale pretraining has not been systematically explored for 3D medical object detection; existing approaches predominantly rely on 2D or natural-image pretraining and thus neglect intrinsic 3D volumetric features.
Method: This work presents the first systematic investigation of large-scale pretraining tailored to 3D medical detection, encompassing both CNN and Transformer architectures under three paradigms: voxel/image reconstruction-based self-supervised learning, supervised learning, and contrastive learning.
Contribution/Results: Self-supervised pretraining via volumetric or slice-wise reconstruction consistently yields substantial gains in detection performance (e.g., mAP), whereas contrastive learning shows no stable improvement. These gains hold across diverse 3D medical benchmarks, including LiTS and BTCV, bridging a critical gap in pretraining research for detection relative to segmentation. All code is publicly released.
📝 Abstract
Large-scale pre-training holds the promise to advance 3D medical object detection, a crucial component of accurate computer-aided diagnosis. Yet, it remains underexplored compared to segmentation, where pre-training has already demonstrated significant benefits. Existing pre-training approaches for 3D object detection rely on 2D medical data or natural image pre-training, failing to fully leverage 3D volumetric information. In this work, we present the first systematic study of how existing pre-training methods can be integrated into state-of-the-art detection architectures, covering both CNNs and Transformers. Our results show that pre-training consistently improves detection performance across various tasks and datasets. Notably, reconstruction-based self-supervised pre-training outperforms supervised pre-training, while contrastive pre-training provides no clear benefit for 3D medical object detection. Our code is publicly available at: https://github.com/MIC-DKFZ/nnDetection-finetuning.
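The reconstruction-based self-supervised paradigm the paper highlights can be illustrated with a minimal sketch: hide a random subset of voxels in a 3D patch, have a model reconstruct the volume, and penalize error only on the hidden voxels. This is an illustrative toy in NumPy, not the paper's implementation; the function names, masking ratio, and zero-fill corruption are assumptions for demonstration.

```python
import numpy as np

def random_voxel_mask(shape, mask_ratio, rng):
    """Boolean mask marking which voxels are hidden during pre-training."""
    n = int(np.prod(shape))
    k = int(n * mask_ratio)
    idx = rng.choice(n, size=k, replace=False)
    mask = np.zeros(n, dtype=bool)
    mask[idx] = True
    return mask.reshape(shape)

def masked_reconstruction_loss(volume, reconstruction, mask):
    """MSE computed only over the masked (hidden) voxels."""
    diff = (volume - reconstruction)[mask]
    return float(np.mean(diff ** 2))

rng = np.random.default_rng(0)
vol = rng.standard_normal((8, 8, 8))           # toy stand-in for a CT patch
mask = random_voxel_mask(vol.shape, 0.6, rng)  # hide 60% of the voxels
corrupted = np.where(mask, 0.0, vol)           # the model's input
# A perfect reconstruction would recover `vol` and drive this loss to zero;
# scoring the corrupted input itself gives a nonzero baseline.
loss = masked_reconstruction_loss(vol, corrupted, mask)
```

In the real setting the reconstruction would come from the detection backbone (CNN or Transformer) fitted with a lightweight decoder, and the pretrained encoder weights would then initialize the detector.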