🤖 AI Summary
Current 3D chest CT analysis methods struggle to combine whole-volume disease identification, precise anomaly localization, and interpretability, and they fail to preserve spatial evidence. This work proposes EXACT, an interpretable, anomaly-aware vision foundation model that learns anatomy-aware, voxel-level anomaly representations through weakly supervised alignment of clinical CT scans with free-text radiology reports, without requiring voxel-level annotations. The model jointly optimizes organ segmentation and multi-instance anomaly localization, and it generates organ- and disease-specific 3D anomaly score maps that enable zero-shot anomaly localization and visually grounded report generation. In a large-scale, multinational, multicenter retrospective evaluation, the model outperforms existing 3D medical foundation models across diverse tasks, including multi-disease diagnosis, zero-shot localization, downstream task transfer, and radiology report generation.
📝 Abstract
Chest computed tomography (CT) is central to the detection and management of thoracic disease, yet the growing scale and complexity of volumetric imaging increasingly exceed what can be addressed by scan-level prediction alone. Clinically useful AI for CT must not only recognize disease across the whole volume, but also localize abnormalities and provide interpretable visual evidence. Existing vision-language foundation models typically compress scans and reports into global image-text representations, limiting their ability to preserve spatial evidence and support clinically meaningful interpretation. Here we developed EXACT, an explainable anomaly-aware foundation model for three-dimensional chest CT that learns spatially resolved representations from paired clinical scans and radiology reports. EXACT was pre-trained on 25,692 CT-report pairs using anatomy-aware weak supervision, jointly learning organ segmentation and multi-instance anomaly localization without manual voxel-level annotations. The resulting organ-specific anomaly-aware maps assign each voxel a disease-specific anomaly score confined to its corresponding anatomy, jointly encoding lesion extent and organ-level context. In retrospective multinational and multi-center evaluations, EXACT showed broad and consistent improvements across clinically relevant CT tasks, spanning multi-disease diagnosis, zero-shot anomaly localization, downstream adaptation, and visually grounded report generation, outperforming existing three-dimensional medical foundation models. By transforming routine clinical CT scans and free-text reports into explainable voxel-level representations, EXACT establishes a scalable paradigm for trustworthy volumetric medical AI.
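To make the idea of an organ-confined anomaly map concrete, the following is a minimal NumPy sketch of how voxel-level scores might be restricted to an organ mask and pooled into a scan-level score, as in generic multi-instance learning. The abstract does not specify EXACT's pooling operator or architecture, so the function names (`organ_confined_map`, `scan_level_score`), the top-k mean pooling, and the `top_fraction` parameter are all illustrative assumptions rather than the authors' method.

```python
import numpy as np


def organ_confined_map(voxel_scores: np.ndarray, organ_mask: np.ndarray) -> np.ndarray:
    """Keep anomaly scores inside the organ and set all other voxels to zero."""
    return np.where(organ_mask, voxel_scores, 0.0)


def scan_level_score(voxel_scores: np.ndarray, organ_mask: np.ndarray,
                     top_fraction: float = 0.01) -> float:
    """Pool in-organ voxel scores into one scan-level score.

    Assumption: top-k mean pooling, a common multi-instance learning choice.
    The paper's actual pooling operator is not stated in the abstract.
    """
    inside = voxel_scores[organ_mask]  # 1D array of in-organ scores
    if inside.size == 0:
        return 0.0
    k = min(inside.size, max(1, int(round(top_fraction * inside.size))))
    # After partitioning at index size - k, the last k entries are the k largest.
    top_k = np.partition(inside, inside.size - k)[-k:]
    return float(top_k.mean())


# Toy 4x4x4 volume: the hypothetical "lung" occupies the first two slices.
scores = np.zeros((4, 4, 4))
lung_mask = np.zeros((4, 4, 4), dtype=bool)
lung_mask[:2] = True
scores[0, 1, 1] = 0.9  # lesion-like response inside the organ
scores[3, 3, 3] = 1.0  # spurious response outside the organ

confined = organ_confined_map(scores, lung_mask)
lung_score = scan_level_score(scores, lung_mask)
```

In this toy case the out-of-organ response is ignored by both functions, so only the in-organ voxel drives the scan-level score. Under weak supervision of this kind, a report-derived label would supervise the pooled score while the unpooled map serves as the localization output.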