Whole-body CT attenuation and volume charts from routine clinical scans via evidence-grounded LLM report filtering

📅 2026-05-07

🤖 AI Summary
Clinical CT data are heavily enriched with pathology, which makes it difficult to establish reference distributions of quantitative biomarkers for healthy populations. This study proposes a large language model (LLM) ensemble that combines evidence grounding with cross-verification to filter pathological findings out of more than 350,000 CT reports, yielding a pathology-reduced cohort. Using this cohort, the authors establish whole-body CT attenuation and volume reference charts covering 106 anatomical structures, accounting for covariates including age, sex, contrast enhancement, and acquisition parameters. The charts are fitted with distribution-aware generalized additive models for location, scale, and shape (GAMLSS), enabling covariate-adjusted centile scoring. The resulting reference charts support standardized quantitative phenotyping, multi-site imaging studies, and opportunistic screening research.
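To illustrate what covariate-adjusted centile scoring means in practice, here is a minimal sketch. It assumes a fitted model has already returned a location `mu` and scale `sigma` for a given patient's covariates, and it assumes a normal distribution purely for simplicity; the distribution families actually fitted in the paper are not stated here and may include additional shape parameters. The function name and the numbers in the usage lines are hypothetical.

```python
import math


def centile_score(value: float, mu: float, sigma: float) -> float:
    """Return the percentile (0-100) of `value` under Normal(mu, sigma).

    `mu` and `sigma` stand in for the covariate-specific location and scale
    that a fitted GAMLSS would predict (assumption: normal family only).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (value - mu) / sigma
    # Standard normal CDF expressed through the error function.
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical example: an organ volume of 1700 mL where the reference
# model predicts mu = 1500 mL and sigma = 200 mL for this patient.
score = centile_score(1700.0, mu=1500.0, sigma=200.0)
print(f"centile: {score:.1f}")
```

Because `mu` and `sigma` depend on age, sex, contrast, and acquisition settings, the same raw measurement can map to different centiles for different patients or scans.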
📝 Abstract
Interpreting quantitative CT biomarkers, such as organ volume and tissue attenuation, requires large-scale healthy reference distributions. However, creating these is challenging because clinical datasets are often heavily enriched with pathology. Here, we develop an evidence-grounded, cross-verified large language model (LLM) ensemble to filter pathological findings from radiology reports, enabling the construction of pathology-reduced cohorts from over 350,000 CT examinations. Five LLMs, first, flag structure-level abnormality candidates grounded in verbatim report evidence and, second, resolve disagreements via cross-verification. Using distribution-aware generalized additive models for location, scale, and shape, we establish comprehensive whole-body reference charts for 106 anatomical structures (volumes and attenuation) across adulthood, accounting for age, sex, contrast enhancement, and acquisition parameters. Longitudinal analyses reveal structure- and contrast-dependent changes distinct from cross-sectional trends. These resources facilitate covariate-adjusted centile scoring from routine CT, supporting standardized quantitative phenotyping, multi-site imaging studies, and scalable opportunistic screening research.
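The two-stage filtering idea can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: each model's output is represented as hypothetical `(structure, quote)` pairs, a flag counts as grounded only if its quote occurs in the report (matched here up to case and whitespace), and the cross-verification step itself is not implemented. The sketch only separates structures that every model flags from those that need a second pass.

```python
from collections import defaultdict


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so quotes match across line breaks."""
    return " ".join(text.lower().split())


def grounded_flags(report, flags):
    """Keep only (structure, quote) flags whose quote appears in the report."""
    norm_report = _normalize(report)
    kept = []
    for structure, quote in flags:
        norm_quote = _normalize(quote)
        if norm_quote and norm_quote in norm_report:
            kept.append((structure, quote))
    return kept


def triage(report, model_outputs):
    """Split flagged structures into unanimous and disputed sets.

    `model_outputs` holds one list of (structure, quote) flags per model.
    Disputed structures are the ones a cross-verification pass would revisit.
    """
    votes = defaultdict(int)
    for flags in model_outputs:
        # Count each structure at most once per model.
        for structure in {s for s, _ in grounded_flags(report, flags)}:
            votes[structure] += 1
    n_models = len(model_outputs)
    agreed = sorted(s for s, v in votes.items() if v == n_models)
    disputed = sorted(s for s, v in votes.items() if 0 < v < n_models)
    return agreed, disputed


# Hypothetical report text and model outputs, for illustration only.
report = "Liver: 2 cm hypodense lesion in segment 7. Spleen is unremarkable."
outputs = [
    [("liver", "2 cm hypodense lesion")],
    [("liver", "hypodense lesion in segment 7"), ("spleen", "splenomegaly")],
    [("liver", "2 cm hypodense lesion"), ("kidney_left", "Spleen is unremarkable")],
]
agreed, disputed = triage(report, outputs)
print(agreed, disputed)
```

Requiring a matching quote discards flags with no textual support (the `splenomegaly` flag above), while a flag whose quote exists but supports the wrong structure survives grounding and is left for the disagreement step.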
Problem

Research questions and friction points this paper is trying to address.

CT attenuation
organ volume
reference charts
pathology filtering
quantitative biomarkers
Innovation

Methods, ideas, or system contributions that make the work stand out.

large language model
CT attenuation
reference charts
pathology filtering
quantitative phenotyping