Universal CT Representations from Anatomy to Disease Phenotype through Agglomerative Pretraining

📅 2026-05-20

🤖 AI Summary
This work proposes FlexiCT, a family of general-purpose foundation models for computed tomography (CT) imaging, addressing the fragmentation of existing CT AI across single-task models that lack a universal representation. FlexiCT is trained with a three-stage agglomerative continual pretraining strategy on 266,227 CT volumes from 56 public datasets, progressing from 2D axial pretraining to 3D anatomical pretraining to report-guided vision-language semantic alignment, so that it learns a unified representation spanning anatomy to disease phenotype. Across five downstream task families, FlexiCT matches or exceeds prior task-specific approaches on multiple benchmarks. Its embedding space organizes CT scans along gradients associated with clinically relevant phenotypes such as tumor stage, and the pretraining corpus forms a large-scale public resource for CT representation learning.
📝 Abstract
Computed tomography (CT) is central to three-dimensional medical imaging, yet CT-based artificial intelligence remains fragmented across task-specific models for segmentation, classification, registration, and report analysis. Here we present FlexiCT, a family of CT foundation models trained by agglomerative continual pretraining on 266,227 CT volumes from 56 publicly available datasets, forming a large-scale public resource for CT representation learning. Pretraining proceeds in three stages: two-dimensional axial pretraining, three-dimensional anatomical pretraining, and report-guided semantic alignment. This training strategy supports slice-level, volume-level, and vision-language analysis. Across five downstream task families (segmentation, classification, registration, vision-language understanding, and clinical retrieval), FlexiCT matches or exceeds prior task-specific approaches on multiple benchmarks. Its embeddings further organize CT scans along gradients associated with tumor stage, suggesting that CT foundation models can capture imaging features relevant to disease phenotype characterization. Code is available at https://github.com/ricklisz/FlexiCT
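To make the clinical retrieval task concrete, the following is a minimal sketch of embedding-based retrieval by cosine similarity. It does not use the FlexiCT code or weights: the function name `retrieve_top_k`, the 64-dimensional vectors, and the synthetic gallery are all assumptions chosen for illustration, standing in for volume-level embeddings that a CT foundation model might produce.

```python
import numpy as np


def retrieve_top_k(query, gallery, k=3):
    """Return indices and cosine scores of the k gallery rows closest to query.

    Hypothetical helper for illustration; not part of the FlexiCT repository.
    """
    # L2-normalize so that the dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    scores = g @ q
    # Sort by descending similarity and keep the first k entries.
    order = np.argsort(-scores)[:k]
    return order, scores[order]


# Synthetic stand-ins for volume-level embeddings (NOT real model outputs).
rng = np.random.default_rng(0)
gallery = rng.normal(size=(100, 64))

# A query built as a slightly perturbed copy of gallery item 7,
# mimicking a scan that closely resembles one already in the database.
query = gallery[7] + 0.01 * rng.normal(size=64)

indices, scores = retrieve_top_k(query, gallery, k=3)
print(indices, scores)
```

In this toy setup the top-ranked result is the gallery item the query was derived from. With real embeddings, the same ranking step would return the stored scans whose representations lie closest to the query scan.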
Problem

Research questions and friction points this paper is trying to address.

CT foundation models
medical imaging
representation learning
disease phenotype
universal representation
Innovation

Methods, ideas, or system contributions that make the work stand out.

agglomerative pretraining
CT foundation model
vision-language alignment
universal representation
disease phenotype characterization