ClusterSC: Advancing Synthetic Control with Donor Selection

📅 2025-03-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In individual-level observational studies, the synthetic control (SC) method suffers from the curse of dimensionality: high-dimensional donor pools degrade its performance. To address this, we propose ClusterSC, a clustering-driven donor selection framework that integrates unsupervised clustering (e.g., K-means or spectral clustering) directly into the SC pipeline. Donor units are first partitioned into clusters; then, for each treated unit, donors are selected and weighted exclusively from its nearest cluster. We derive an improved theoretical bound on estimation error and use cross-validation to choose the number of clusters and the donor composition. Experiments on multiple synthetic benchmarks and on real-world healthcare and economic datasets show that ClusterSC reduces average causal effect estimation error by 22–38% relative to standard SC. It also improves scalability, robustness to noise and heterogeneity, and out-of-sample generalization.
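The pipeline described in the summary (cluster the donors, pick the cluster nearest to the treated unit, fit weights on that cluster only) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the function names (`kmeans`, `simplex_weights`, `cluster_sc`) and the toy data are hypothetical. The weights are assumed to be non-negative and to sum to one, as in classical SC, and they are fitted with a simple Frank-Wolfe loop chosen only to keep the example dependency-free. The paper's actual clustering and weighting steps may differ.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm; centroids start at the first k points (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def simplex_weights(donors, target, iters=500):
    """Frank-Wolfe for min_w ||target - sum_j w_j * donors[j]||^2 with w >= 0, sum(w) = 1.

    Assumption: simplex-constrained weights as in classical SC.
    """
    n, T = len(donors), len(target)
    w = [1.0 / n] * n
    for _ in range(iters):
        fit = [sum(w[j] * donors[j][t] for j in range(n)) for t in range(T)]
        resid = [f - y for f, y in zip(fit, target)]
        grad = [sum(r * d for r, d in zip(resid, donors[j])) for j in range(n)]
        j = min(range(n), key=grad.__getitem__)          # best simplex vertex
        direction = [donors[j][t] - fit[t] for t in range(T)]
        denom = sum(s * s for s in direction)
        if denom == 0.0:
            break
        # Exact line search for a quadratic objective, clipped to [0, 1].
        gamma = -sum(r * s for r, s in zip(resid, direction)) / denom
        gamma = max(0.0, min(1.0, gamma))
        w = [(1.0 - gamma) * wi for wi in w]
        w[j] += gamma
    return w


def cluster_sc(donors_pre, donors_post, target_pre, k=2):
    """Cluster donors on pre-period outcomes, then fit SC weights on the nearest cluster."""
    labels, centroids = kmeans(donors_pre, k)
    nearest = min(range(k), key=lambda c: sq_dist(target_pre, centroids[c]))
    chosen = [i for i, lab in enumerate(labels) if lab == nearest]
    w = simplex_weights([donors_pre[i] for i in chosen], target_pre)
    horizon = len(donors_post[0])
    counterfactual = [sum(wj * donors_post[i][t] for wj, i in zip(w, chosen))
                      for t in range(horizon)]
    return chosen, w, counterfactual


# Toy data (hypothetical): two low-level donors and two high-level donors.
donors_pre = [[1.0, 1.1, 0.9], [5.0, 5.2, 4.9], [1.2, 1.0, 1.1], [5.1, 4.8, 5.0]]
donors_post = [[1.0, 1.2], [5.0, 5.3], [2.0, 2.4], [5.5, 5.1]]
target_pre = [1.15, 1.025, 1.05]  # 0.25 * donor 0 + 0.75 * donor 2

chosen, weights, counterfactual = cluster_sc(donors_pre, donors_post, target_pre, k=2)
print(chosen, weights, counterfactual)
```

Subtracting `counterfactual` from the treated unit's observed post-period outcomes would give the estimated effect. In this sketch the number of clusters `k` is fixed by hand; the summary indicates that the paper selects it by cross-validation.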

📝 Abstract

In causal inference with observational studies, synthetic control (SC) has emerged as a prominent tool. SC has traditionally been applied to aggregate-level datasets, but more recent work has extended its use to individual-level data. Because individual-level datasets contain many more observed units, this shift introduces the curse of dimensionality to SC. To address this, we propose Cluster Synthetic Control (ClusterSC), based on the idea that there may be groups of individuals whose behavior aligns within a group but diverges across groups. ClusterSC incorporates a clustering step to select only the relevant donors for the target. We provide theoretical guarantees on the improvements induced by ClusterSC, supported by empirical demonstrations on synthetic and real-world datasets. The results indicate that ClusterSC consistently outperforms classical SC approaches.

Problem

Research questions and friction points this paper is trying to address.

Addresses curse of dimensionality in synthetic control for individual-level data

Proposes clustering to select relevant donors for target unit

Improves performance over classical synthetic control methods

Innovation

Methods, ideas, or system contributions that make the work stand out.

ClusterSC uses clustering for donor selection

Addresses curse of dimensionality in synthetic control

Improves performance over classical SC methods

🔎 Similar Papers

Synthesizing Interpretable Control Policies through Large Language Model Guided Search

2024-10-07 · arXiv.org · Citations: 0