A hierarchical modelling approach for Bayesian Causal Forests on longitudinal data: A Case Study in Multiple Sclerosis Clinical Trials

📅 2025-08-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In longitudinal studies, within-subject correlation, time-varying confounding, and heterogeneous treatment effects make valid causal inference difficult. Existing Bayesian methods such as Bayesian Additive Regression Trees (BART) and Bayesian Causal Forests (BCF) assume independent observations, so they cannot capture within-individual dependence and are ill-suited to repeated-measures settings. To address this, we propose BCFLong, a hierarchical longitudinal causal model that adds individual-specific random intercepts and slopes to the BCF framework. BCFLong separates the prognostic and treatment-effect components of the mean and places a sparsity-inducing horseshoe prior on the individual-specific random effects. In simulations, BCFLong improves outcome and treatment-effect estimation and is robust to sparsity. Applied to NO.MS, a large clinical trial dataset in multiple sclerosis, BCFLong captures clinically meaningful longitudinal patterns in brain volume change that would otherwise have remained undetected.

📝 Abstract

Long-running clinical trials offer a unique opportunity to study disease progression and treatment response over time, enabling questions about how and when interventions alter patient trajectories. However, drawing causal conclusions in this setting is challenging due to irregular follow-up, individual-level heterogeneity, and time-varying confounding. Bayesian Additive Regression Trees (BART) and their extension, Bayesian Causal Forests (BCF), have proven powerful for flexible causal inference in observational data, especially for heterogeneous treatment effects and non-linear outcome surfaces. Yet, both models assume independence across observations and are fundamentally limited in their ability to model within-individual correlation over time. This limits their use in real-world longitudinal settings where repeated measures are the norm. Motivated by the NO.MS dataset, the largest and most comprehensive clinical trial dataset in Multiple Sclerosis (MS), with more than 35,000 patients and up to 15 years of follow-up, we develop BCFLong, a hierarchical model that preserves BART's strengths while extending it for longitudinal analysis. Inspired by BCF, we decompose the mean into prognostic and treatment effects, modelling the former on Image Quality Metrics (IQMs) to account for scanner effects, and introduce individual-specific random effects, including intercepts and slopes, with a sparsity-inducing horseshoe prior. Simulations confirm BCFLong's superior performance and robustness to sparsity, significantly improving outcome and treatment effect estimation. On NO.MS, BCFLong captures clinically meaningful longitudinal patterns in brain volume change, which would have otherwise remained undetected. These findings highlight the importance of adaptively accounting for within-individual correlations and position BCFLong as a flexible framework for causal inference in longitudinal data.
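
The decomposition described in the abstract can be written out as a rough sketch. The notation below is an assumption made for illustration and is not taken from the paper: $i$ indexes individuals, $j$ indexes visits, $t_{ij}$ is the visit time, $z_i$ is the treatment indicator, $x_{ij}$ collects covariates (including the IQMs for the prognostic term), and $\mu$ and $\tau$ stand for BART-style tree ensembles. The horseshoe prior is shown in its standard form with local scales $\lambda_{ki}$ and a global scale $\nu$; the paper's exact specification may differ.

```latex
\begin{align*}
y_{ij} &= \mu(x_{ij}) + \tau(x_{ij})\, z_i + b_{0i} + b_{1i}\, t_{ij} + \varepsilon_{ij},
  & \varepsilon_{ij} &\sim \mathcal{N}(0, \sigma^2), \\
b_{ki} \mid \lambda_{ki}, \nu &\sim \mathcal{N}(0, \lambda_{ki}^2 \nu^2),
  & \lambda_{ki}, \nu &\sim \mathrm{C}^{+}(0, 1), \quad k \in \{0, 1\}.
\end{align*}
```

Here $\mu$ is the prognostic effect, $\tau$ is the treatment effect, and $b_{0i}$, $b_{1i}$ are the individual-specific random intercept and slope that induce correlation among repeated measures from the same individual.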

Problem

Research questions and friction points this paper is trying to address.

Modeling causal effects in longitudinal clinical trial data

Addressing within-individual correlation in Bayesian Causal Forests

Improving treatment effect estimation in Multiple Sclerosis studies
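
To make the within-individual correlation problem concrete, the following is a minimal simulation sketch. It is not the paper's simulation design: the function names, the prognostic and treatment-effect functions, and all parameter values are hypothetical choices for illustration. It generates repeated measures with a random intercept and slope per individual, then checks how strongly the residuals at two visits are correlated once the fixed part of the mean is removed.

```python
import numpy as np

def simulate_longitudinal(n_subjects=200, n_visits=5, sd_intercept=1.0,
                          sd_slope=0.3, sd_noise=0.5, seed=0):
    """Simulate outcomes with individual-specific random intercepts and slopes.

    All functional forms and values here are hypothetical illustrations.
    Returns the outcomes and the fixed part of the mean, both with shape
    (n_subjects, n_visits).
    """
    rng = np.random.default_rng(seed)
    t = np.tile(np.arange(n_visits, dtype=float), (n_subjects, 1))  # visit times
    x = rng.normal(size=(n_subjects, 1))           # baseline covariate
    z = rng.integers(0, 2, size=(n_subjects, 1))   # randomised treatment indicator
    mu = np.sin(x) - 0.2 * t                       # prognostic part (assumed form)
    tau = 0.5 + 0.3 * (x > 0)                      # heterogeneous effect (assumed form)
    b0 = rng.normal(0.0, sd_intercept, size=(n_subjects, 1))  # random intercepts
    b1 = rng.normal(0.0, sd_slope, size=(n_subjects, 1))      # random slopes
    eps = rng.normal(0.0, sd_noise, size=(n_subjects, n_visits))
    fixed = mu + tau * z
    y = fixed + b0 + b1 * t + eps
    return y, fixed

def within_subject_correlation(resid):
    """Correlation across individuals between residuals at the first two visits."""
    return np.corrcoef(resid[:, 0], resid[:, 1])[0, 1]

y, fixed = simulate_longitudinal()
r_with = within_subject_correlation(y - fixed)

y0, fixed0 = simulate_longitudinal(sd_intercept=0.0, sd_slope=0.0)
r_without = within_subject_correlation(y0 - fixed0)

print(f"residual correlation with random effects:    {r_with:.2f}")
print(f"residual correlation without random effects: {r_without:.2f}")
```

With random effects present, the residuals from the same individual are clearly correlated, which is the dependence that a model assuming independent observations leaves unaccounted for.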

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical Bayesian Causal Forests for longitudinal data

Individual-specific random effects with sparsity-inducing prior

Models prognostic and treatment effects separately
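
The sparsity-inducing prior mentioned in the list above is a horseshoe prior. The sketch below illustrates the standard horseshoe only, not how BCFLong applies it: the function name and the unit global scale are assumptions. It draws half-Cauchy local scales and computes the implied shrinkage weight, which for unit noise variance is the fraction by which a single observed effect is pulled toward zero.

```python
import numpy as np

def horseshoe_shrinkage_weights(n_draws=100_000, global_scale=1.0, seed=0):
    """Draw shrinkage weights implied by a standard horseshoe prior.

    A weight near 1 means the effect is shrunk almost entirely to zero;
    a weight near 0 means it is left almost untouched.
    """
    rng = np.random.default_rng(seed)
    # Half-Cauchy(0, 1) local scales: absolute value of a standard Cauchy draw.
    local_scale = np.abs(rng.standard_cauchy(n_draws))
    return 1.0 / (1.0 + (global_scale * local_scale) ** 2)

weights = horseshoe_shrinkage_weights()
near_edges = np.mean((weights < 0.1) | (weights > 0.9))
in_middle = np.mean((weights > 0.4) & (weights < 0.6))
print(f"share of weights near 0 or 1: {near_edges:.2f}")
print(f"share of weights near 0.5:    {in_middle:.2f}")
```

The weights pile up near 0 and 1 rather than in the middle, so the prior tends either to leave an effect nearly unshrunk or to shrink it almost completely. Applied to random intercepts and slopes, this kind of behaviour would let individuals with negligible deviations be shrunk toward zero while large deviations are retained.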
