A New Targeted-Federated Learning Framework for Estimating Heterogeneity of Treatment Effects: A Robust Framework with Applications in Aging Cohorts

📅 2025-10-22
🤖 AI Summary
Estimating heterogeneous treatment effects (HTE) from multi-source data is challenging because population heterogeneity and privacy constraints hinder data pooling. Method: We propose a doubly robust framework that integrates targeted learning with federated learning. It defines a projection-based estimand for a target population, combines information across distributed datasets without sharing raw data, and accounts for covariate distribution shift among sources. The framework supports difference-based estimands for continuous outcomes and odds ratio-based estimands for binary outcomes. Contribution/Results: We develop a communication-efficient bootstrap-based selection procedure that detects non-transportable data sources, so that information can be aggregated robustly without introducing bias. In extensive simulation studies the proposed estimator outperforms existing methods, and an application to nationwide Medicare-linked data illustrates the utility of the approach.

📝 Abstract
Analyzing data from multiple sources offers valuable opportunities to improve the estimation efficiency of causal estimands. However, this analysis also poses many challenges due to population heterogeneity and data privacy constraints. While several advanced methods for causal inference in federated settings have been developed in recent years, many focus on difference-based averaged causal effects and are not designed to study effect modification. In this study, we introduce a novel targeted-federated learning framework to study the heterogeneity of treatment effects (HTEs) for a targeted population by proposing a projection-based estimand. This HTE framework integrates information from multiple data sources without sharing raw data, while accounting for covariate distribution shifts among sources. Our proposed approach is shown to be doubly robust, conveniently supporting both difference-based estimands for continuous outcomes and odds ratio-based estimands for binary outcomes. Furthermore, we develop a communication-efficient bootstrap-based selection procedure to detect non-transportable data sources, thereby enhancing robust information aggregation without introducing bias. The superior performance of the proposed estimator over existing methods is demonstrated through extensive simulation studies, and the utility of our approach has been shown in a real-world data application using nationwide Medicare-linked data.
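The idea of combining sites without sharing raw data can be illustrated with a minimal sketch. This is not the paper's estimator: it assumes a randomized treatment with known propensity 0.5, linear outcome models, and a linear projection of the effect onto a single effect modifier, and it omits the covariate-shift correction toward a target population. All function names (`aipw_pseudo_outcome`, `site_summary`, `aggregate`, `simulate_site`) and the simulated data are hypothetical. Each site computes doubly robust (AIPW) pseudo-outcomes locally and shares only two small summary matrices.

```python
import numpy as np

def aipw_pseudo_outcome(y, a, x, e):
    """Doubly robust (AIPW) pseudo-outcome for a difference-based effect."""
    design = np.column_stack([np.ones(len(y)), x])
    # Arm-specific outcome regressions fitted by ordinary least squares.
    b1, *_ = np.linalg.lstsq(design[a == 1], y[a == 1], rcond=None)
    b0, *_ = np.linalg.lstsq(design[a == 0], y[a == 0], rcond=None)
    mu1, mu0 = design @ b1, design @ b0
    # Outcome-model contrast plus inverse-probability-weighted residuals.
    return mu1 - mu0 + a * (y - mu1) / e - (1 - a) * (y - mu0) / (1 - e)

def site_summary(y, a, x, v, e):
    """Summary statistics a site shares in place of individual records."""
    phi = aipw_pseudo_outcome(y, a, x, e)
    vd = np.column_stack([np.ones(len(y)), v])  # intercept + effect modifier
    return vd.T @ vd, vd.T @ phi

def aggregate(summaries):
    """Solve the pooled least-squares projection from site-level summaries."""
    gram = sum(g for g, _ in summaries)
    cross = sum(c for _, c in summaries)
    return np.linalg.solve(gram, cross)

def simulate_site(rng, n, shift):
    """Hypothetical site whose covariate mean is shifted by `shift`."""
    x = rng.normal(shift, 1.0, size=(n, 2))
    a = rng.binomial(1, 0.5, size=n)          # randomized treatment
    tau = 1.0 + 0.5 * x[:, 0]                 # true effect varies with x[:, 0]
    y = x @ np.array([1.0, -1.0]) + a * tau + rng.normal(0.0, 1.0, n)
    return y, a, x, x[:, 0], 0.5

rng = np.random.default_rng(0)
summaries = [site_summary(*simulate_site(rng, 5000, s)) for s in (0.0, 0.5, 1.0)]
beta = aggregate(summaries)  # [intercept, slope] of the projected effect
print(beta)
```

In this simulation the true effect is linear in the modifier, so the projection coefficients should land near 1.0 and 0.5 regardless of which population is targeted. With a nonlinear effect, the projection would depend on the target covariate distribution, which is where a shift correction of the kind the abstract describes would matter.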
Problem

Research questions and friction points this paper is trying to address.

Estimating treatment effect heterogeneity across populations
Integrating multi-source data under privacy constraints
Developing robust causal inference with distribution shifts
Innovation

Methods, ideas, or system contributions that make the work stand out.

Targeted-federated learning for heterogeneous treatment effects
Doubly robust projection-based estimand without data sharing
Bootstrap-based selection for non-transportable data sources
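A rough sketch of how a bootstrap-based screen for non-transportable sources might look is given below. It is an illustrative assumption rather than the paper's procedure: each site bootstraps its own estimate locally and shares only a point estimate and a standard error, and a source is kept when its estimate lies within `z` combined standard errors of the target site's estimate. The names `bootstrap_summary` and `select_sources`, the threshold `z`, and the simulated values are all hypothetical.

```python
import numpy as np

def bootstrap_summary(values, n_boot, rng):
    """Return the sample mean and its bootstrap standard error."""
    n = len(values)
    # Resample row indices with replacement, one row per bootstrap replicate.
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_means = values[idx].mean(axis=1)
    return values.mean(), boot_means.std(ddof=1)

def select_sources(target, sources, z):
    """Keep sources whose estimate is within z combined SEs of the target."""
    t_est, t_se = target
    keep = []
    for name, (s_est, s_se) in sources.items():
        if abs(s_est - t_est) <= z * np.hypot(t_se, s_se):
            keep.append(name)
    return keep

rng = np.random.default_rng(1)
# Stand-ins for per-record effect contributions at each site (simulated).
target = bootstrap_summary(rng.normal(1.0, 1.0, 2000), 500, rng)
sources = {
    "site_a": bootstrap_summary(rng.normal(1.0, 1.0, 2000), 500, rng),  # same effect
    "site_b": bootstrap_summary(rng.normal(2.0, 1.0, 2000), 500, rng),  # shifted effect
}
kept = select_sources(target, sources, z=4.0)
print(kept)
```

Only two numbers per site cross the network in this sketch, which is one simple way to keep communication low. A hard threshold is a deliberate simplification, and a data-adaptive weighting scheme would be a natural alternative.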
Rong Zhao
Department of Public Health Sciences, Penn State College of Medicine
Jason Falvey
Department of Physical Therapy and Rehabilitation Science, University of Maryland School of Medicine
Xu Shi
University of Michigan
Electronic Health Record, Causal Inference, Negative Control, Machine Translation
Vernon M. Chinchilli
Department of Public Health Sciences, Penn State College of Medicine
Chixiang Chen
Associate Professor in Biostatistics, University of Maryland School of Medicine, Baltimore.
Statistics and Biostatistics