AI Summary
This study addresses the challenge of causal discovery in multivariate extreme-value scenarios, where sparse extreme observations, strong dependencies, and latent confounders hinder reliable inference. The authors propose the principle of tail-induced asymmetry and introduce S3ME, a two-stage data-driven framework: it first recovers the causal skeleton via proxy-adjusted penalized neighborhood selection, then orients edges by minimizing tail prediction risk under a max-linear envelope model. This approach is the first to exploit tail asymmetry for identifying causal directions in heavy-tailed systems, achieving identifiable causal discovery without assuming a known graph structure, even in high-dimensional settings with latent variables. The method enjoys theoretical guarantees for consistent skeleton estimation and demonstrates robustness to unobserved confounders, scalability, and the ability to recover sparse, interpretable propagation structures in real-world river and financial tail-risk networks.
Abstract
Causal discovery in multivariate extremes is challenging because extreme observations are sparse, dependent, and often affected by latent common shocks. Existing approaches focus on undirected extremal dependence, require prior graph restriction, and do not scale beyond small systems. We introduce tail-induced asymmetry as a principle for causal directionality in heavy-tailed systems, where extreme events propagate asymmetrically so that forward tail prediction is systematically easier than backward prediction. We show that this asymmetry yields identifiable causal direction under a canonical max-linear model and provides a basis for score-based structure learning in the tail regime. Building on this, we propose Sparse Structure diScovery in Multivariate Extremes (S3ME), a two-stage data-driven framework for causal discovery. The first stage performs proxy-adjusted penalized neighbourhood selection to recover a sparse candidate skeleton under latent confounding. The second stage orients edges by minimizing tail prediction risk based on max-linear envelope models, exploiting directional asymmetry. We establish high-dimensional guarantees for skeleton screening and consistency of the score-based estimator under population separation conditions. Simulations demonstrate robustness to latent confounding and favourable scaling relative to existing extremal methods. Applications to river network data and financial tail-risk networks show that the approach recovers sparse, interpretable propagation structures without prespecified graph structure.
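To make the idea of tail-induced asymmetry concrete, the following is a minimal sketch on a toy bivariate max-linear model, X2 = max(b * X1, Z2), with unit Fréchet noise. It is not the S3ME estimator: the function names `simulate_pair` and `tail_risk`, the coefficient `b = 0.8`, the 0.95 tail threshold, and the log-scale envelope risk are all assumptions chosen for illustration. The abstract does not specify the exact form of the tail prediction risk.

```python
import numpy as np


def simulate_pair(n, b=0.8, seed=0):
    """Draw n samples from a toy max-linear pair with X1 -> X2 (illustrative only)."""
    rng = np.random.default_rng(seed)
    # The reciprocal of a standard exponential variable is unit Frechet (heavy-tailed).
    z = 1.0 / rng.standard_exponential(size=(n, 2))
    x1 = z[:, 0]
    x2 = np.maximum(b * x1, z[:, 1])  # effect = max(scaled cause, own shock)
    return x1, x2


def tail_risk(cause, effect, q=0.95):
    """Hypothetical envelope risk for predicting `effect` from extreme `cause`.

    On the samples where `cause` exceeds its q-quantile, fit the lower envelope
    effect >= coef * cause (coef is the smallest observed ratio) and return the
    mean log-scale gap between the observations and that envelope.
    """
    tail = cause > np.quantile(cause, q)
    log_ratio = np.log(effect[tail]) - np.log(cause[tail])
    log_coef = log_ratio.min()  # envelope coefficient on the log scale
    return float(np.mean(log_ratio - log_coef))


if __name__ == "__main__":
    x1, x2 = simulate_pair(20000)
    print("forward risk  (X1 -> X2):", tail_risk(x1, x2))
    print("backward risk (X2 -> X1):", tail_risk(x2, x1))
```

In this toy setting, an extreme X1 almost always forces X2 onto the envelope b * X1, so the forward risk is small. An extreme X2 may instead come from its own shock Z2, which leaves X1 unconstrained and inflates the backward risk. Comparing the two scores is one simple way an edge could be oriented under these assumptions.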