🤖 AI Summary
Time-series counterfactual estimation faces two key challenges: counterfactual trajectories are never observed, and time-varying confounders distort estimation at every step. To address these, we propose a framework that combines deconfounding with temporal generalization through two components: Sub-treatment Group Alignment (SGA) and Random Temporal Masking (RTM). SGA achieves fine-grained covariate balance by using iterative treatment-agnostic clustering to identify sub-treatment groups and aligning their distributions in latent space. RTM randomly replaces input covariates with Gaussian noise during training, which strengthens reliance on stable historical patterns and mitigates temporal drift. These components are combined with adversarial learning and sequence modeling. Experiments on multiple benchmark datasets show that the joint approach outperforms both its individual ablations and existing state-of-the-art methods in counterfactual prediction accuracy and temporal robustness.
📝 Abstract
Estimating counterfactual outcomes from time-series observations is crucial for effective decision-making, e.g., when to administer a life-saving treatment, yet remains challenging because (i) the counterfactual trajectory is never observed and (ii) confounders evolve with time and distort estimation at every step. To address these challenges, we propose a novel framework that integrates two complementary approaches: Sub-treatment Group Alignment (SGA) and Random Temporal Masking (RTM). Instead of the coarse practice of aligning the marginal distributions of treatment groups in latent space, SGA uses iterative treatment-agnostic clustering to identify fine-grained sub-treatment groups. Aligning these fine-grained groups improves distributional matching and thus leads to more effective deconfounding. We theoretically demonstrate that SGA optimizes a tighter upper bound on counterfactual risk and empirically verify its deconfounding efficacy. RTM promotes temporal generalization by randomly replacing input covariates with Gaussian noise during training. This encourages the model to rely less on potentially noisy or spuriously correlated covariates at the current step and more on stable historical patterns, thereby improving its ability to generalize across time and better preserve underlying causal relationships. Our experiments demonstrate that while applying SGA and RTM individually improves counterfactual outcome estimation, their combination consistently achieves state-of-the-art performance. This success comes from their distinct yet complementary roles: RTM enhances temporal generalization and robustness across time steps, while SGA improves deconfounding at each specific time point.
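
As an illustration of the masking step described for RTM, the following is a minimal NumPy sketch, not the authors' implementation. It assumes covariates are stored as a `(batch, time, features)` array and that masking is applied per time step, replacing the whole covariate vector at a selected step with Gaussian noise. The function name `random_temporal_mask` and the parameters `mask_prob` and `noise_std` are hypothetical; the abstract does not specify the masking rate, the noise scale, or the masking granularity.

```python
import numpy as np


def random_temporal_mask(covariates, mask_prob=0.3, noise_std=1.0, rng=None):
    """Replace covariates at randomly chosen time steps with Gaussian noise.

    covariates: array of shape (batch, time, features).
    mask_prob:  probability of masking a given time step (assumed value).
    noise_std:  standard deviation of the replacement noise (assumed value).
    Returns the masked array and the boolean mask of shape (batch, time).
    """
    rng = np.random.default_rng() if rng is None else rng
    batch, steps, _ = covariates.shape

    # One Bernoulli draw per (sequence, time step); True means "mask this step".
    mask = rng.random((batch, steps)) < mask_prob

    # Zero-mean Gaussian noise with the same shape as the input.
    noise = rng.normal(0.0, noise_std, size=covariates.shape)

    # Broadcast the step-level mask over the feature axis.
    masked = np.where(mask[:, :, None], noise, covariates)
    return masked, mask


# Usage: mask roughly half of the time steps in a toy batch.
rng = np.random.default_rng(0)
x = np.ones((4, 10, 3))
x_masked, mask = random_temporal_mask(x, mask_prob=0.5, rng=rng)
```

In a training loop, a function like this would be applied to each batch before the forward pass and disabled at evaluation time, so that the model sees clean covariates when making predictions.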