CT-IDP: Segmentation-Derived Quantitative Phenotypes for Interpretable Abdominal CT Disease Classification

📅 2026-05-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Disease classification from abdominal CT often lacks interpretability and standardized quantitative phenotypic support. This work proposes CT-IDP (CT Image-Derived Phenotypes), an organ-segmentation-driven quantitative phenotyping framework that uses TotalSegmentator to derive over 900 morphometric, attenuation, and contextual/burden descriptors from multi-institutional CT scans. Sparse, disease-specific logistic regression models with elastic-net regularization are trained on these descriptors and benchmarked against a DINOv3-based vision-transformer baseline. The models are developed on MERLIN and evaluated on MERLIN plus two external datasets, Duke-Abdomen and AMOS, where CT-IDP achieves macro-AUCs of 0.897, 0.877, and 0.780, respectively, versus 0.880, 0.857, and 0.756 for the baseline. The approach pairs this performance with coefficient-level interpretability and a frozen specification that transfers across institutions.
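To make the idea of segmentation-derived descriptors concrete, the following is a minimal sketch of how a volume and two attenuation statistics might be computed per organ from a label mask. It is not the CT-IDP implementation: the function name `organ_phenotypes`, the label IDs, and the toy volume are all hypothetical, and a real TotalSegmentator label map would need to be substituted.

```python
import numpy as np


def organ_phenotypes(ct_hu, labels, spacing_mm, organ_ids):
    """Return per-organ volume (mL) and attenuation statistics (HU).

    ct_hu      : 3D array of CT intensities in Hounsfield units
    labels     : 3D integer array of the same shape (one label per voxel)
    spacing_mm : voxel spacing along each axis, in millimetres
    organ_ids  : dict mapping organ name -> label ID (hypothetical IDs here)
    """
    voxel_ml = float(np.prod(spacing_mm)) / 1000.0  # mm^3 -> mL
    features = {}
    for name, label_id in organ_ids.items():
        mask = labels == label_id
        n_voxels = int(mask.sum())
        features[f"{name}_volume_ml"] = n_voxels * voxel_ml
        if n_voxels == 0:
            # Organ absent from the mask: attenuation is undefined.
            features[f"{name}_mean_hu"] = float("nan")
            features[f"{name}_std_hu"] = float("nan")
            continue
        hu = ct_hu[mask]
        features[f"{name}_mean_hu"] = float(hu.mean())
        features[f"{name}_std_hu"] = float(hu.std())
    return features


# Synthetic toy volume (assumption: label 1 = "liver", label 2 = "spleen").
ct = np.full((4, 4, 4), -100.0)
seg = np.zeros((4, 4, 4), dtype=np.int32)
seg[:2] = 1
ct[:2] = 60.0

feats = organ_phenotypes(
    ct, seg, spacing_mm=(1.0, 1.0, 2.0), organ_ids={"liver": 1, "spleen": 2}
)
for key, value in feats.items():
    print(key, value)
```

Each descriptor stays tied to a named organ and a physical unit, which is what allows a downstream linear model's coefficients to be read directly.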

📝 Abstract

In this retrospective multi-institutional study, a quantitative phenotyping framework, CT-IDP (CT Image-Derived Phenotypes), was developed on the MERLIN abdominal CT benchmark (training, validation, and test sets of 15,175, 5,018, and 5,082 studies, respectively) and externally evaluated on two independent datasets: Duke-Abdomen (2,000) and AMOS (1,107). Multi-organ segmentations were generated with TotalSegmentator and used to derive over 900 organ- and compartment-level descriptors spanning morphometry, attenuation, and contextual/burden findings. Sparse disease-specific logistic regression with elastic-net regularization was trained on MERLIN and externally validated under a frozen specification. Performance was compared against a DINOv3-based vision-transformer baseline using AUC and average precision (AP), supported by phenotype-stratified audits and coefficient-level inspection. Macro-AUC for CT-IDP versus the baseline was 0.897 versus 0.880 on MERLIN, 0.877 versus 0.857 on Duke-Abdomen, and 0.780 versus 0.756 on AMOS.
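The modelling and evaluation steps described in the abstract can be illustrated with a small, self-contained sketch: one elastic-net-regularized logistic regression per disease, scored with AUC and averaged into a macro-AUC. This is an illustration on synthetic data, not the authors' code. The proximal-gradient solver, the hyperparameters (`alpha`, `l1_ratio`, `lr`, `n_iter`), and the two toy "diseases" are assumptions made for the example.

```python
import numpy as np


def fit_elastic_net_logreg(X, y, alpha=0.05, l1_ratio=0.5, lr=0.1, n_iter=2000):
    """Proximal-gradient fit of logistic regression with an elastic-net penalty.

    Penalty: alpha * (l1_ratio * ||w||_1 + 0.5 * (1 - l1_ratio) * ||w||_2^2).
    The intercept b is left unpenalized.
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(n_iter):
        z = np.clip(X @ w + b, -30.0, 30.0)
        p = 1.0 / (1.0 + np.exp(-z))
        # Gradient of the smooth part: logistic loss plus the L2 term.
        grad_w = X.T @ (p - y) / n + alpha * (1.0 - l1_ratio) * w
        grad_b = float(np.mean(p - y))
        w = w - lr * grad_w
        b = b - lr * grad_b
        # Proximal step for the L1 term: soft-thresholding yields exact zeros.
        thresh = lr * alpha * l1_ratio
        w = np.sign(w) * np.maximum(np.abs(w) - thresh, 0.0)
    return w, b


def auc(y_true, scores):
    """AUC via the Mann-Whitney U statistic, using average ranks for ties."""
    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(len(scores), dtype=float)
    i = 0
    while i < len(scores):
        j = i
        while j + 1 < len(scores) and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        ranks[order[i:j + 1]] = 0.5 * (i + j) + 1.0  # mean of 1-based ranks
        i = j + 1
    n_pos = float(y_true.sum())
    n_neg = len(y_true) - n_pos
    return (ranks[y_true == 1].sum() - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


# Synthetic stand-in for a phenotype table: 400 studies, 20 standardized features.
rng = np.random.default_rng(0)
X = rng.standard_normal((400, 20))
# Two toy diseases, each driven by a single feature (index 0 and index 1).
Y = np.stack(
    [(X[:, k] + 0.5 * rng.standard_normal(400) > 0).astype(float) for k in (0, 1)],
    axis=1,
)
X_train, X_test, Y_train, Y_test = X[:300], X[300:], Y[:300], Y[300:]

weights, aucs = [], []
for k in range(Y.shape[1]):
    w, b = fit_elastic_net_logreg(X_train, Y_train[:, k])
    weights.append(w)
    aucs.append(auc(Y_test[:, k], X_test @ w + b))

macro_auc = float(np.mean(aucs))  # unweighted mean of per-disease AUCs
print("per-disease AUC:", aucs, "macro-AUC:", macro_auc)
```

Because each model is linear in named features, inspecting `weights[k]` shows which descriptors drive a given disease prediction, which is the kind of coefficient-level inspection the abstract mentions. Macro-AUC weights every disease equally regardless of prevalence.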

Problem

Research questions and friction points this paper is trying to address.

abdominal CT

disease classification

quantitative phenotyping

interpretable AI

medical image analysis

Innovation

Methods, ideas, or system contributions that make the work stand out.

quantitative phenotyping

interpretable AI

multi-organ segmentation