🤖 AI Summary
This study addresses the problem of testing whether the causal dependence of an outcome variable $Y$ on a covariate $X$ changes at a given time point in the presence of confounding variables $\boldsymbol{Z}$, and of detecting such changes when their location is not known in advance. The work proposes the first fully nonparametric method that avoids strong structural assumptions by constructing a test statistic based on the integrated difference of kernel mean embeddings of conditional copulas. The underlying quantity is provably zero under the null hypothesis of no change and strictly positive otherwise. Combining kernel methods, copula theory, and conditional distribution embeddings, the approach yields an estimator with near-linear time complexity and explicit convergence rates. Empirical evaluations show high detection accuracy across multiple synthetic and real-world datasets, supporting its use for identifying shifts in causal mechanisms.
📝 Abstract
We propose a framework for determining whether the causal dependence of an outcome $Y$ on a covariate $X$ changes at a given time point, given confounders $\boldsymbol{Z}$. For instance, in financial markets, the causal effect of a market indicator on asset returns may change over time. While many existing measures of association can be used to detect changes in joint and marginal distributions, in the absence of strong assumptions on the data-generating process none are suitable for detecting changes in the causal mechanism or in the strength of the causal relationship. In this work we approach the problem from a fully non-parametric perspective, and treat the causal mechanism as well as the distribution of the data as unknown. We introduce a quantity based on the integrated difference between kernel mean embeddings of certain conditional copulas, which is provably equal to zero if the causal dependence does not change and strictly positive otherwise. A near-linear time estimator for the quantity is proposed, with rates of convergence explicitly spelled out. Extensive experiments demonstrate that the proposed statistic achieves high accuracy on multiple synthetic and real-world datasets. We additionally show how the proposed statistic can be used for change point detection when the goal is to detect changes in causal dependence occurring at unknown times.
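To give a rough feel for the kind of quantity described above, the following is a minimal sketch, not the estimator proposed in the paper. It makes several simplifying assumptions that are not from the abstract: the confounders $\boldsymbol{Z}$ are ignored (so the copula is unconditional), the candidate time point `split` is given, the kernel is Gaussian with a hand-picked `bandwidth`, and the statistic is a plain quadratic-time squared MMD (V-statistic) rather than a near-linear time estimator. All function names are hypothetical.

```python
import numpy as np


def pseudo_observations(v):
    """Map a 1-D sample to (0, 1) via normalized ranks (empirical copula scale)."""
    ranks = np.argsort(np.argsort(v)) + 1  # ranks 1..n, assuming no ties
    return ranks / (len(v) + 1.0)


def gaussian_gram(a, b, bandwidth):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))


def copula_mmd2(x, y, split, bandwidth=0.25):
    """Squared MMD (V-statistic) between the empirical copulas of (x, y)
    before and after index `split`. Unconditional simplification: no Z."""
    # Ranks are computed within each segment, so purely marginal shifts
    # in x or y do not affect the statistic.
    before = np.column_stack(
        [pseudo_observations(x[:split]), pseudo_observations(y[:split])]
    )
    after = np.column_stack(
        [pseudo_observations(x[split:]), pseudo_observations(y[split:])]
    )
    k_bb = gaussian_gram(before, before, bandwidth).mean()
    k_aa = gaussian_gram(after, after, bandwidth).mean()
    k_ba = gaussian_gram(before, after, bandwidth).mean()
    return k_bb + k_aa - 2.0 * k_ba


# Synthetic illustration (made-up data): the sign of the dependence of y on x
# flips halfway through one series and stays fixed in the other.
rng = np.random.default_rng(0)
n = 400
x = rng.normal(size=n)
noise = rng.normal(size=n)
y_change = np.where(np.arange(n) < n // 2, x, -x) + 0.3 * noise
y_stable = x + 0.3 * noise

stat_change = copula_mmd2(x, y_change, n // 2)
stat_stable = copula_mmd2(x, y_stable, n // 2)
print(stat_change, stat_stable)
```

In this toy setting the statistic for the series with a dependence flip should be clearly larger than for the stable series. Handling confounders, integrating over conditioning values, and scanning over unknown change locations are the parts the paper addresses and this sketch leaves out.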