The Missing Piece: A Case for Pre-Training in 3D Medical Object Detection

📅 2025-09-19
🤖 AI Summary
Large-scale pre-training remains systematically unexplored for 3D medical object detection: existing approaches predominantly rely on 2D or natural-image pre-training and thus neglect intrinsic 3D volumetric features. Method: This work presents the first systematic investigation of large-scale pre-training tailored to 3D medical detection, covering both CNN and Transformer architectures under three paradigms: reconstruction-based self-supervised learning (volumetric or slice-wise), supervised learning, and contrastive learning. Contribution/Results: Reconstruction-based self-supervised pre-training consistently yields substantial detection gains, whereas contrastive learning shows no stable improvement. Pre-training delivers consistent mAP improvements across diverse 3D medical benchmarks, including LiTS and BTCV, bridging a critical gap in pre-training research for detection relative to segmentation. All code is publicly released.

📝 Abstract
Large-scale pre-training holds the promise to advance 3D medical object detection, a crucial component of accurate computer-aided diagnosis. Yet, it remains underexplored compared to segmentation, where pre-training has already demonstrated significant benefits. Existing pre-training approaches for 3D object detection rely on 2D medical data or natural image pre-training, failing to fully leverage 3D volumetric information. In this work, we present the first systematic study of how existing pre-training methods can be integrated into state-of-the-art detection architectures, covering both CNNs and Transformers. Our results show that pre-training consistently improves detection performance across various tasks and datasets. Notably, reconstruction-based self-supervised pre-training outperforms supervised pre-training, while contrastive pre-training provides no clear benefit for 3D medical object detection. Our code is publicly available at: https://github.com/MIC-DKFZ/nnDetection-finetuning.
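To make the reconstruction paradigm concrete: the idea behind masked-voxel self-supervised pre-training is to hide part of a 3D volume and train a network to reconstruct the hidden voxels, computing the loss only on masked positions. The sketch below is illustrative only, not the authors' implementation (their code lives in the linked repository); the function name is hypothetical and a constant mean predictor stands in for the encoder-decoder network purely to show the shape of the objective.

```python
import numpy as np

def masked_reconstruction_loss(volume, mask_ratio=0.6, rng=None):
    """Toy masked-voxel reconstruction objective (MAE-style).

    A real setup would feed the masked volume through a 3D
    encoder-decoder and backpropagate this loss; here a constant
    mean predictor replaces the network for illustration.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    mask = rng.random(volume.shape) < mask_ratio      # True = hidden voxel
    visible = volume[~mask]
    prediction = np.full_like(volume, visible.mean())  # stand-in "model"
    # The loss is evaluated only on the masked (hidden) voxels.
    return float(np.mean((prediction[mask] - volume[mask]) ** 2))

# Example: a synthetic 16^3 CT-like patch
patch = np.random.default_rng(1).normal(size=(16, 16, 16))
loss = masked_reconstruction_loss(patch)
```

In practice the mask is applied in contiguous 3D blocks rather than per voxel, and the pre-trained encoder weights are then used to initialize the detection backbone.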
Problem

Research questions and friction points this paper is trying to address.

Exploring pre-training for 3D medical object detection
Addressing underutilization of 3D volumetric information
Evaluating pre-training methods across detection architectures
Innovation

Methods, ideas, or system contributions that make the work stand out.

Pre-training for 3D medical detection
Systematic study integrating pre-training methods
Reconstruction-based self-supervised pre-training outperforms supervised
Katharina Eckstein
Medical Faculty Heidelberg, Heidelberg University, Heidelberg, Germany

Constantin Ulrich
German Cancer Research Center (DKFZ)
Medical Image Computing, Medical Physics, Computer Vision

Michael Baumgartner
Faculty of Mathematics and Computer Science, Heidelberg University, Germany

Jessica Kächele
Medical Faculty Heidelberg, Heidelberg University, Heidelberg, Germany

Dimitrios Bounias
Medical Faculty Heidelberg, Heidelberg University, Heidelberg, Germany

Tassilo Wald
PhD Student, German Cancer Research Center (DKFZ)
Representation Learning, Self-Supervised Learning, Medical Image Analysis

Ralf Floca
Medical Image Computing, German Cancer Research Center (DKFZ)
Medical Image Processing, Uncertainty Quantification, Oncology, Radiology, Radiation Therapy

Klaus H. Maier-Hein
Professor, Medical Image Computing, German Cancer Research Center
Medical Image Analysis, Machine Learning