Influence of Classification Task and Distribution Shift Type on OOD Detection in Fetal Ultrasound

📅 2025-09-22
🤖 AI Summary
Reliable out-of-distribution (OOD) detection in fetal ultrasound imaging is critical for the safe clinical deployment of deep learning models under heterogeneous real-world conditions. Method: We systematically evaluate eight uncertainty quantification methods across four classification tasks and multiple ID-OOD splits that reflect two kinds of distribution shift: image characteristic shifts and anatomical feature shifts. Contribution/Results: OOD detection performance varies significantly with the classification task, and the best task depends on how the ID-OOD criterion is defined; no single task dominates across shift types. We also show that superior OOD detection does not guarantee optimal abstained prediction, so task selection and uncertainty strategy should be aligned with the specific downstream application.

📝 Abstract
Reliable out-of-distribution (OOD) detection is important for safe deployment of deep learning models in fetal ultrasound amidst heterogeneous image characteristics and clinical settings. OOD detection relies on estimating a classification model's uncertainty, which should increase for OOD samples. While existing research has largely focused on uncertainty quantification methods, this work investigates the impact of the classification task itself. Through experiments with eight uncertainty quantification methods across four classification tasks, we demonstrate that OOD detection performance significantly varies with the task, and that the best task depends on the defined ID-OOD criteria; specifically, whether the OOD sample is due to: i) an image characteristic shift or ii) an anatomical feature shift. Furthermore, we reveal that superior OOD detection does not guarantee optimal abstained prediction, underscoring the necessity to align task selection and uncertainty strategies with the specific downstream application in medical image analysis.
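
To make the evaluation setup in the abstract concrete, the sketch below scores OOD detection with a single uncertainty measure. It assumes predictive entropy of the softmax output as the uncertainty score and AUROC as the detection metric; the abstract does not name the eight methods or the metric, so both are illustrative choices, and the logits are made-up values rather than results from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predictive_entropy(probs):
    """Shannon entropy of a class distribution; higher means more uncertain."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def auroc(id_scores, ood_scores):
    """Probability that a random OOD sample gets a higher uncertainty
    score than a random ID sample (ties count as half)."""
    wins = 0.0
    for o in ood_scores:
        for i in id_scores:
            if o > i:
                wins += 1.0
            elif o == i:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))

# Hypothetical classifier logits (assumption, not data from the paper).
id_logits = [[4.0, 0.5, 0.1], [0.2, 3.5, 0.3], [0.1, 0.4, 5.0]]
ood_logits = [[1.0, 1.1, 0.9], [0.5, 0.4, 0.6], [2.0, 1.8, 0.2]]

id_unc = [predictive_entropy(softmax(z)) for z in id_logits]
ood_unc = [predictive_entropy(softmax(z)) for z in ood_logits]
ood_auroc = auroc(id_unc, ood_unc)
print(f"OOD detection AUROC: {ood_auroc:.2f}")
```

An AUROC near 1 means uncertainty is consistently higher on OOD samples than on ID samples. Because the logits come from the classifier, changing the classification task changes the scores, which is the dependency the paper investigates.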
Problem

Research questions and friction points this paper is trying to address.

Investigating how classification task choice affects OOD detection in fetal ultrasound
Analyzing how different distribution shift types impact OOD detection performance
Examining the relationship between OOD detection and abstained prediction performance
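
The third question concerns abstained prediction, which is scored differently from OOD detection. A minimal sketch, assuming abstention means discarding the most uncertain samples and measuring accuracy on the rest; the function name `accuracy_at_coverage`, the coverage parameter, and the example values are hypothetical and not taken from the paper.

```python
def accuracy_at_coverage(uncertainties, correct, coverage):
    """Keep the `coverage` fraction of samples with the lowest uncertainty
    (rounded down, at least one sample) and return accuracy on them."""
    n_keep = max(1, int(coverage * len(uncertainties)))
    order = sorted(range(len(uncertainties)), key=lambda i: uncertainties[i])
    kept = order[:n_keep]
    return sum(correct[i] for i in kept) / n_keep

# Hypothetical per-sample uncertainty and correctness flags (1 = correct).
uncertainties = [0.05, 0.10, 0.20, 0.60, 0.80, 0.90]
correct = [1, 1, 0, 1, 0, 0]

full_accuracy = accuracy_at_coverage(uncertainties, correct, 1.0)
half_accuracy = accuracy_at_coverage(uncertainties, correct, 0.5)
print(f"accuracy at 100% coverage: {full_accuracy:.2f}")
print(f"accuracy at 50% coverage:  {half_accuracy:.2f}")
```

This metric depends on whether retained predictions are correct, not on whether a sample is labelled ID or OOD, so an uncertainty score that separates OOD samples well can still rank misclassified samples poorly.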
Innovation

Methods, ideas, or system contributions that make the work stand out.

Investigates classification task impact on OOD detection
Evaluates eight uncertainty methods across four tasks
Aligns task selection with specific downstream application
Authors

Chun Kit Wong
Technical University of Denmark, Kongens Lyngby, Denmark
Anders N. Christensen
Technical University of Denmark, Kongens Lyngby, Denmark
Cosmin I. Bercea
Technical University of Munich
Computer Vision, Multimodal Learning, Generative AI, Anomaly Detection, Medical Image Analysis
Julia A. Schnabel
Technical University of Munich, Munich, Germany
Martin G. Tolsgaard
University of Copenhagen, Copenhagen, Denmark
Aasa Feragen
Professor, DTU Compute
Machine learning, medical imaging, geometric modelling