RESOLVE-IPD: High-Fidelity Individual Patient Data Reconstruction and Uncertainty-Aware Subgroup Meta-Analysis

📅 2025-11-03
🤖 AI Summary
Existing methods for digitizing Kaplan-Meier (KM) curves have three key limitations: coordinate extraction errors, an unrealistic assumption of uniform censoring, and no way to reconstruct individual patient data (IPD) for subgroups when only summary statistics are published. This paper presents what it describes as the first uncertainty-aware, subgroup-level IPD reconstruction framework. It combines VEC-KM, which extracts KM coordinates with high precision, and CEN-KM, which corrects for non-uniform censoring, with the MAPLE algorithm, which infers subgroup label probabilities under marginal constraints and propagates the resulting evidence to produce ensembles of statistically feasible labelings. Evaluated on four phase III trials in esophageal squamous cell carcinoma, the method is reported to improve the accuracy and reproducibility of treatment effect estimation in the PD-L1 low-expression subgroup. The stated goal is IPD reconstruction for precision oncology that is high-fidelity, interpretable, and explicit about its uncertainty.

📝 Abstract
Individual patient data (IPD) from oncology trials are essential for reliable evidence synthesis but are rarely publicly available, necessitating reconstruction from published Kaplan-Meier (KM) curves. Existing reconstruction methods suffer from digitization errors, unrealistic uniform censoring assumptions, and the inability to recover subgroup-level IPD when only aggregate statistics are available. We developed RESOLVE-IPD, a unified computational framework that enables high-fidelity IPD reconstruction and uncertainty-aware subgroup meta-analysis to address these limitations. RESOLVE-IPD comprises two components. The first component, High-Fidelity IPD Reconstruction, integrates the VEC-KM and CEN-KM modules: VEC-KM extracts precise KM coordinates and explicit censoring marks from vectorized figures, minimizing digitization error, while CEN-KM corrects overlapping censor symbols and eliminates the uniform censoring assumption. The second component, Uncertainty-Aware Subgroup Recovery, employs the MAPLE (Marginal Assignment of Plausible Labels and Evidence Propagation) algorithm to infer patient-level subgroup labels consistent with published summary statistics (e.g., hazard ratio, median overall survival) when subgroup KM curves are unavailable. MAPLE generates ensembles of mathematically valid labelings, facilitating a propagating meta-analysis that quantifies and reflects uncertainty from subgroup reconstruction. RESOLVE-IPD was validated through a subgroup meta-analysis of four trials in advanced esophageal squamous cell carcinoma, focusing on the programmed death ligand 1 (PD-L1)-low population. RESOLVE-IPD enables accurate IPD reconstruction and robust, uncertainty-aware subgroup meta-analyses, strengthening the reliability and transparency of secondary evidence synthesis in precision oncology.
Problem

Research questions and friction points this paper is trying to address.

Reconstructing individual patient data from oncology trial Kaplan-Meier curves
Addressing digitization errors and unrealistic censoring assumptions in reconstruction
Enabling subgroup meta-analysis when only aggregate statistics are available
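
To make the censoring problem concrete, here is a minimal sketch of the Kaplan-Meier estimator applied to a small, hypothetical set of patient records. It is not code from RESOLVE-IPD; the function name `kaplan_meier` and the toy data are assumptions made for illustration. The point is that moving a single censoring time changes the curve, which is why reconstruction that guesses censoring times can distort the recovered IPD.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  -- follow-up time per patient
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Standard product-limit update: multiply by (1 - d / n_at_risk).
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= exits[t]  # remove deaths and censored patients at time t
    return curve

# Hypothetical toy IPD: six patients, two of them censored (event = 0).
times = [2, 3, 3, 5, 8, 9]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)

# Same patients, but the censoring at t=8 is placed at t=4 instead,
# i.e. before the death at t=5 rather than after it.
times_shifted = [2, 3, 3, 5, 4, 9]
shifted = kaplan_meier(times_shifted, events)

for (t, s), (_, s2) in zip(curve, shifted):
    print(f"t={t}: S={s:.3f}  (censor moved: S={s2:.3f})")
```

In this toy case the two curves agree until the relocated censoring time and diverge afterwards, because the censored patient leaves the risk set earlier and each later death removes a larger share of the remaining patients.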
Innovation

Methods, ideas, or system contributions that make the work stand out.

Vectorized KM curve extraction minimizing digitization errors
Censoring correction eliminating uniform censoring assumptions
Uncertainty-aware subgroup recovery using ensemble labelings
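
The idea of ensemble labelings can be illustrated with a toy brute-force search. This is not the MAPLE algorithm, whose internals are not described here; it is a sketch under stated assumptions. It enumerates every way to pick a subgroup of a published size from hypothetical reconstructed IPD and keeps the labelings whose Kaplan-Meier median matches a published subgroup median within a tolerance. The names `km_median` and `plausible_labelings`, the data, the tolerance, and the use of the median as the only constraint are all assumptions.

```python
from itertools import combinations

def km_median(times, events):
    """Smallest time at which the Kaplan-Meier estimate is <= 0.5 (None if never)."""
    # Process deaths before censorings at tied times (the usual convention).
    order = sorted(range(len(times)), key=lambda i: (times[i], -events[i]))
    at_risk = len(times)
    surv = 1.0
    for i in order:
        if events[i]:
            surv *= 1.0 - 1.0 / at_risk
            if surv <= 0.5 + 1e-12:  # small slack for floating-point rounding
                return times[i]
        at_risk -= 1
    return None

def plausible_labelings(times, events, n_sub, target_median, tol):
    """All size-n_sub index sets whose KM median is within tol of the target."""
    keep = []
    for members in combinations(range(len(times)), n_sub):
        m = km_median([times[i] for i in members], [events[i] for i in members])
        if m is not None and abs(m - target_median) <= tol:
            keep.append(members)
    return keep

# Hypothetical reconstructed IPD for one trial arm (8 patients).
times = [2, 3, 4, 5, 6, 8, 9, 11]
events = [1, 1, 0, 1, 1, 0, 1, 1]

# Hypothetical published summary: subgroup of 4 patients, median survival 5.0.
ensemble = plausible_labelings(times, events, n_sub=4, target_median=5.0, tol=0.5)

# Per-patient membership frequency across the ensemble.
membership = [sum(i in lab for lab in ensemble) / len(ensemble)
              for i in range(len(times))]

# Propagate the labeling uncertainty to a downstream quantity:
# here, the range of subgroup event counts across all plausible labelings.
event_counts = [sum(events[i] for i in lab) for lab in ensemble]
lo, hi = min(event_counts), max(event_counts)
print(len(ensemble), "plausible labelings; subgroup events range", lo, "to", hi)
```

Exhaustive enumeration is only feasible for tiny examples; it is used here so the result is deterministic. The takeaway is the workflow rather than the search strategy: many labelings can satisfy the same published summary, and reporting a downstream quantity over the whole ensemble, rather than for a single labeling, keeps that ambiguity visible.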
Lang Lang
Department of Applied Mathematics and Statistics, Johns Hopkins University
Yao Zhao
Department of Applied Mathematics and Statistics, Johns Hopkins University
Qiuxin Gao
Department of Applied Mathematics and Statistics, Johns Hopkins University
Yanxun Xu
Johns Hopkins University
Bayesian, Clinical trial Design, Electronic Health Record Data, Network Data