ClusterSC: Advancing Synthetic Control with Donor Selection

πŸ“… 2025-03-27
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
In individual-level observational studies, the synthetic control (SC) method suffers from the curse of dimensionality and degraded performance due to high-dimensional donor pools. To address this, we propose ClusterSCβ€”a clustering-driven donor selection framework that integrates unsupervised clustering (e.g., K-means or spectral clustering) directly into the SC pipeline: donor units are first partitioned into clusters, and for each treated unit, donors are selected and weighted exclusively from its nearest cluster. We derive an improved theoretical bound on estimation error and incorporate cross-validation to optimize cluster number and donor composition. Extensive experiments on multiple synthetic benchmarks and real-world healthcare and economic datasets demonstrate that ClusterSC reduces average causal effect estimation error by 22–38% over standard SC. Moreover, it significantly enhances scalability, robustness to noise and heterogeneity, and out-of-sample generalization capability.

Technology Category

Application Category

πŸ“ Abstract
In causal inference with observational studies, synthetic control (SC) has emerged as a prominent tool. SC has traditionally been applied to aggregate-level datasets, but more recent work has extended its use to individual-level data. As they contain a greater number of observed units, this shift introduces the curse of dimensionality to SC. To address this, we propose Cluster Synthetic Control (ClusterSC), based on the idea that groups of individuals may exist where behavior aligns internally but diverges between groups. ClusterSC incorporates a clustering step to select only the relevant donors for the target. We provide theoretical guarantees on the improvements induced by ClusterSC, supported by empirical demonstrations on synthetic and real-world datasets. The results indicate that ClusterSC consistently outperforms classical SC approaches.
Problem

Research questions and friction points this paper is trying to address.

Addresses curse of dimensionality in synthetic control for individual-level data
Proposes clustering to select relevant donors for target unit
Improves performance over classical synthetic control methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

ClusterSC uses clustering for donor selection
Addresses curse of dimensionality in synthetic control
Improves performance over classical SC methods
πŸ”Ž Similar Papers
No similar papers found.