A New Targeted-Federated Learning Framework for Estimating Heterogeneity of Treatment Effects: A Robust Framework with Applications in Aging Cohorts

📅 2025-10-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Estimating heterogeneous treatment effects (HTEs) from multi-source data is difficult because populations differ across sources and privacy constraints prevent pooling raw data. Method: We propose a targeted-federated learning framework built on a projection-based estimand for a target population. It integrates information from distributed data sources without sharing raw data, accounts for covariate distribution shifts among sources, and is doubly robust, supporting difference-based estimands for continuous outcomes and odds ratio-based estimands for binary outcomes. Contribution/Results: We develop a communication-efficient bootstrap-based selection procedure that detects non-transportable data sources, so information can be aggregated robustly without introducing bias. In simulation studies the proposed estimator outperforms existing methods, and an application to nationwide Medicare-linked data illustrates its utility in aging cohorts.

📝 Abstract

Analyzing data from multiple sources offers valuable opportunities to improve the estimation efficiency of causal estimands. However, this analysis also poses many challenges due to population heterogeneity and data privacy constraints. While several advanced methods for causal inference in federated settings have been developed in recent years, many focus on difference-based averaged causal effects and are not designed to study effect modification. In this study, we introduce a novel targeted-federated learning framework to study the heterogeneity of treatment effects (HTEs) for a targeted population by proposing a projection-based estimand. This HTE framework integrates information from multiple data sources without sharing raw data, while accounting for covariate distribution shifts among sources. Our proposed approach is shown to be doubly robust, conveniently supporting both difference-based estimands for continuous outcomes and odds ratio-based estimands for binary outcomes. Furthermore, we develop a communication-efficient bootstrap-based selection procedure to detect non-transportable data sources, thereby enhancing robust information aggregation without introducing bias. The superior performance of the proposed estimator over existing methods is demonstrated through extensive simulation studies, and the utility of our approach has been shown in a real-world data application using nationwide Medicare-linked data.
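
To make the idea of a doubly robust, projection-based estimand more concrete, here is a minimal single-site sketch for a continuous outcome. It is an illustration under stated assumptions, not the paper's estimator: it uses the standard augmented inverse-probability-weighted (AIPW) pseudo-outcome, projects it onto one effect modifier by least squares, takes the propensity score as known, and omits the federated aggregation and covariate-shift correction the abstract describes. The function names (`fit_linear`, `dr_pseudo_outcome`) and the simulated data are hypothetical.

```python
import numpy as np


def fit_linear(x, y):
    """Ordinary least squares of y on [1, x]; returns (intercept, slope)."""
    design = np.column_stack([np.ones(len(x)), x])
    return np.linalg.lstsq(design, y, rcond=None)[0]


def dr_pseudo_outcome(y, a, ps, mu1, mu0):
    """AIPW pseudo-outcome: its conditional mean equals the treatment effect
    if either the propensity score or the outcome models are correct."""
    return mu1 - mu0 + a * (y - mu1) / ps - (1 - a) * (y - mu0) / (1 - ps)


# Simulated data (assumption): true effect is 1 + 2 * v for effect modifier v.
rng = np.random.default_rng(0)
n = 20_000
v = rng.normal(size=n)
ps = 1.0 / (1.0 + np.exp(-0.5 * v))      # propensity score, treated as known
a = rng.binomial(1, ps)                  # treatment indicator
y = v + a * (1.0 + 2.0 * v) + rng.normal(size=n)

# Outcome models fitted separately in each treatment arm.
c1 = fit_linear(v[a == 1], y[a == 1])
c0 = fit_linear(v[a == 0], y[a == 0])
mu1 = c1[0] + c1[1] * v
mu0 = c0[0] + c0[1] * v

# Project the pseudo-outcome onto [1, v]; the true projection is (1, 2),
# so the estimates should land near those values.
phi = dr_pseudo_outcome(y, a, ps, mu1, mu0)
beta = fit_linear(v, phi)
print(beta)
```

The projection coefficients summarize how the effect varies with `v`, which is why a projection-based target suits effect-modification questions better than a single averaged effect.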

Problem

Research questions and friction points this paper is trying to address.

Estimating treatment effect heterogeneity across populations

Integrating multi-source data under privacy constraints

Developing robust causal inference with distribution shifts

Innovation

Methods, ideas, or system contributions that make the work stand out.

Targeted-federated learning for heterogeneous treatment effects

Doubly robust projection-based estimand without data sharing

Bootstrap-based selection for non-transportable data sources
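
The summary does not spell out the bootstrap-based selection procedure, so the following is only a rough sketch of what such a screen could look like, not the paper's algorithm. It assumes each site bootstraps locally and sends just its projection coefficients and bootstrap standard errors to a coordinator, which flags a source as non-transportable when its coefficients differ from the target site's by more than a chosen number of standard errors. The function names, the threshold `z_max`, and the simulated sites are all hypothetical.

```python
import numpy as np


def fit_projection(phi, v):
    """Least-squares projection of pseudo-outcomes phi onto [1, v]."""
    design = np.column_stack([np.ones(len(v)), v])
    return np.linalg.lstsq(design, phi, rcond=None)[0]


def site_summary(phi, v, n_boot, rng):
    """Runs locally at one site; only coefficients and SEs leave the site."""
    n = len(phi)
    beta = fit_projection(phi, v)
    draws = np.empty((n_boot, beta.size))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample rows with replacement
        draws[b] = fit_projection(phi[idx], v[idx])
    return beta, draws.std(axis=0, ddof=1)


def is_transportable(target, source, z_max=4.0):
    """Keep a source only if every coefficient is within z_max combined
    standard errors of the target estimate (an illustrative rule)."""
    beta_t, se_t = target
    beta_s, se_s = source
    z = np.abs(beta_s - beta_t) / np.sqrt(se_t**2 + se_s**2)
    return bool(np.all(z < z_max))


rng = np.random.default_rng(1)


def simulate_site(intercept, slope, n=2000):
    """Hypothetical site whose pseudo-outcomes follow intercept + slope * v."""
    v = rng.normal(size=n)
    phi = intercept + slope * v + rng.normal(size=n)
    return phi, v


target = site_summary(*simulate_site(1.0, 2.0), n_boot=200, rng=rng)
sources = {"A": simulate_site(1.0, 2.0), "B": simulate_site(3.0, 2.0)}
keep = {
    name: is_transportable(target, site_summary(phi, v, 200, rng))
    for name, (phi, v) in sources.items()
}
print(keep)  # site A matches the target; site B has a shifted intercept
```

Because each site transmits only a coefficient vector and its standard errors, a single round of communication is enough for this screen, which is one plausible reading of "communication-efficient" in this setting.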

🔎 Similar Papers

FedECA: A Federated External Control Arm Method for Causal Inference with Time-To-Event Data in Distributed Settings