🤖 AI Summary
Existing statistical models cannot jointly integrate individual-level survival data, multicellular-level cell composition data, and cellular-level single-cell omics features to identify cell-type-specific prognostic genes. To address this, we propose a class of Bayesian generalized promotion time cure models (GPTCMs), the first framework to integrate three biological scales (individual survival outcomes, cell-type abundances, and single-cell gene expression) while supporting high-dimensional variable selection and survival prediction. The Bayesian GPTCMs combine a Bayesian sparse prior, a cell-type-specific effect structure, and a time-dependent cure mechanism. In both simulation studies and real-world cancer datasets, they improve covariate identification accuracy and prognostic prediction, with an average C-index increase of 0.08–0.12. The models provide an interpretable, scalable statistical framework for dissecting how the tumor microenvironment drives heterogeneity in patient prognosis.
📝 Abstract
Single-cell technologies provide an unprecedented opportunity to dissect the interplay between cancer cells and their associated tumor microenvironment, and the resulting high-dimensional omics data should augment existing survival modeling approaches for identifying tumor cell-type-specific genes predictive of cancer patient survival. However, no existing statistical model integrates multiscale data comprising individual-level survival data, multicellular-level cell composition data, and cellular-level single-cell omics covariates. We propose a class of Bayesian generalized promotion time cure models (GPTCMs) for multiscale data integration to identify cell-type-specific genes and improve cancer prognosis. We demonstrate with simulations in both low- and high-dimensional settings that the proposed Bayesian GPTCMs can identify cell-type-associated covariates and improve survival prediction.