Causal Explanations for Disparate Trends: Where and Why?

📅 2025-12-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Explaining a disparity between two groups in large, high-dimensional data is difficult because interpretable, actionable attributions are lacking. Method: This paper proposes ExDis, an automated framework that brings causal inference into disparity explanation. ExDis identifies the subpopulations where a disparity is most pronounced or reversed, and associates each one with the factors that causally contribute to the disparity there. The authors formally define the underlying optimization problem, analyze its complexity, and develop an efficient algorithm to solve it. Results: In experiments on three real-world datasets, ExDis generates meaningful causal explanations, outperforms prior methods, and scales to large, high-dimensional datasets.

📝 Abstract

During data analysis, we are often perplexed by certain disparities observed between two groups of interest within a dataset. To better understand an observed disparity, we need explanations that can pinpoint the data regions where the disparity is most pronounced, along with its causes, i.e., factors that alleviate or exacerbate the disparity. This task is complex and tedious, particularly for large and high-dimensional datasets, demanding an automatic system for discovering explanations (data regions and causes) of an observed disparity. It is critical that explanations for disparities are not only interpretable but also actionable, enabling users to make informed, data-driven decisions. This requires explanations to go beyond surface-level correlations and instead capture causal relationships. We introduce ExDis, a framework for discovering causal Explanations for Disparities between two groups of interest. ExDis identifies data regions (subpopulations) where disparities are most pronounced (or reversed), and associates specific factors that causally contribute to the disparity within each identified data region. We formally define the ExDis framework and the associated optimization problem, analyze its complexity, and develop an efficient algorithm to solve the problem. Through extensive experiments over three real-world datasets, we demonstrate that ExDis generates meaningful causal explanations, outperforms prior methods, and scales effectively to handle large, high-dimensional datasets.
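To make the "where" part of the abstract concrete, the following is a minimal sketch of ranking data regions by how pronounced a two-group disparity is, and flagging regions where its sign flips relative to the whole dataset. It is not the ExDis algorithm: the exhaustive scan over single attribute-value regions, the function names (`disparity`, `rank_regions`), the `min_size` threshold, and the toy salary data are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def disparity(rows, group_key, outcome_key, group_a, group_b):
    """Mean outcome of group_a minus mean outcome of group_b (None if a group is empty)."""
    a = [r[outcome_key] for r in rows if r[group_key] == group_a]
    b = [r[outcome_key] for r in rows if r[group_key] == group_b]
    if not a or not b:
        return None
    return mean(a) - mean(b)


def rank_regions(rows, group_key, outcome_key, group_a, group_b, attributes, min_size=2):
    """Score every single attribute=value region and sort by absolute disparity."""
    overall = disparity(rows, group_key, outcome_key, group_a, group_b)
    regions = []
    for attr in attributes:
        # Partition the rows by the value of this attribute.
        buckets = defaultdict(list)
        for r in rows:
            buckets[r[attr]].append(r)
        for value, subset in buckets.items():
            if len(subset) < min_size:
                continue  # too small to say anything (hypothetical threshold)
            gap = disparity(subset, group_key, outcome_key, group_a, group_b)
            if gap is None:
                continue  # one of the two groups is absent in this region
            regions.append({
                "region": (attr, value),
                "gap": gap,
                # Reversed: the sign of the gap differs from the overall gap.
                "reversed": overall is not None and gap * overall < 0,
                "size": len(subset),
            })
    regions.sort(key=lambda x: abs(x["gap"]), reverse=True)
    return overall, regions


# Hypothetical toy data: salary (in thousands) by gender and department.
toy_rows = [
    {"gender": "M", "dept": "eng", "salary": 100},
    {"gender": "M", "dept": "eng", "salary": 120},
    {"gender": "F", "dept": "eng", "salary": 80},
    {"gender": "F", "dept": "eng", "salary": 90},
    {"gender": "M", "dept": "ops", "salary": 50},
    {"gender": "M", "dept": "ops", "salary": 60},
    {"gender": "F", "dept": "ops", "salary": 70},
    {"gender": "F", "dept": "ops", "salary": 60},
]

overall, regions = rank_regions(toy_rows, "gender", "salary", "M", "F", ["dept"])
print("overall gap:", overall)
for reg in regions:
    print(reg)
```

On this toy data the overall gap is 7.5, the `eng` region has the most pronounced gap (25), and the `ops` region has a reversed gap (-10). A scan like this only locates regions by raw difference in means; it says nothing about causes, which is the part the abstract argues requires causal reasoning.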

Problem

Research questions and friction points this paper is trying to address.

Identifies data regions with pronounced group disparities

Discovers causal factors contributing to observed disparities

Automates explanation generation for large, high-dimensional datasets

Innovation

Methods, ideas, or system contributions that make the work stand out.

Framework discovers causal explanations for group disparities

Identifies subpopulations with pronounced or reversed disparities

Associates specific causal factors within each data region
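The distinction between a causal factor and a surface-level correlation can be illustrated with a small sketch. It compares a naive difference in means with a stratified (backdoor-style) adjustment over one observed confounder. This is a standard textbook estimator, not the estimator used by ExDis, which this summary does not specify. The function names, the binary `training` factor, the `senior` confounder, and the data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def naive_effect(rows, treatment, outcome):
    """Unadjusted difference in mean outcome between treated and untreated rows."""
    treated = [r[outcome] for r in rows if r[treatment] == 1]
    control = [r[outcome] for r in rows if r[treatment] == 0]
    return mean(treated) - mean(control)


def adjusted_effect(rows, treatment, outcome, confounder):
    """Stratify on the confounder and average per-stratum effects, weighted by stratum size."""
    strata = defaultdict(list)
    for r in rows:
        strata[r[confounder]].append(r)
    weighted = []
    for subset in strata.values():
        treated = [r[outcome] for r in subset if r[treatment] == 1]
        control = [r[outcome] for r in subset if r[treatment] == 0]
        if not treated or not control:
            continue  # no overlap in this stratum, so it cannot be compared
        weighted.append((len(subset), mean(treated) - mean(control)))
    n = sum(size for size, _ in weighted)
    if n == 0:
        return None
    return sum(size * eff for size, eff in weighted) / n


# Hypothetical region: does "training" raise "score", given that senior
# staff both score higher and receive training more often?
region_rows = (
    [{"senior": 0, "training": 1, "score": 10}]
    + [{"senior": 0, "training": 0, "score": 8}] * 3
    + [{"senior": 1, "training": 1, "score": 20}] * 3
    + [{"senior": 1, "training": 0, "score": 18}]
)

naive = naive_effect(region_rows, "training", "score")
adjusted = adjusted_effect(region_rows, "training", "score", "senior")
print("naive:", naive, "adjusted:", adjusted)
```

Here the naive estimate is 7, while the adjusted estimate is 2, because seniority inflates the raw comparison. The adjustment is only valid under the assumption that the listed confounder is the only one, which is itself an assumption of this sketch.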
