Score-based Integrated Gradient for Root Cause Explanations of Outliers

📅 2026-01-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses root cause identification in high-dimensional, nonlinear causal systems under uncertainty by proposing SIREN, a score-based attribution method. The approach estimates the score function (the gradient of the log-likelihood) and integrates it along a path from the anomalous point back toward the normal data distribution, accumulating per-feature contributions. According to the authors, it is the first method to use the score function directly for root cause attribution. It satisfies three of the four classic Shapley value axioms (dummy, efficiency, and linearity) as well as an asymmetry axiom derived from the underlying causal structure. The method is scalable and uncertainty-aware. Experiments on synthetic random graphs and real-world cloud service and supply chain datasets show higher attribution accuracy and computational efficiency than state-of-the-art baselines.

📝 Abstract

Identifying the root causes of outliers is a fundamental problem in causal inference and anomaly detection. Traditional approaches based on heuristics or counterfactual reasoning often struggle under uncertainty and high-dimensional dependencies. We introduce SIREN, a novel and scalable method that attributes the root causes of outliers by estimating the score functions of the data likelihood. Attribution is computed via integrated gradients that accumulate score contributions along paths from the outlier toward the normal data distribution. Our method satisfies three of the four classic Shapley value axioms (dummy, efficiency, and linearity) as well as an asymmetry axiom derived from the underlying causal structure. Unlike prior work, SIREN operates directly on the score function, enabling tractable and uncertainty-aware root cause attribution in nonlinear, high-dimensional, and heteroscedastic causal models. Extensive experiments on synthetic random graphs and real-world cloud service and supply chain datasets show that SIREN outperforms state-of-the-art baselines in both attribution accuracy and computational efficiency.
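
The idea of integrating a score function along a path can be illustrated with a small toy sketch. It is not the SIREN implementation. It assumes a closed-form Gaussian score instead of a learned score estimate, a straight-line path from the distribution mean to the outlier, and no causal graph, so the asymmetry axiom is not modeled. All function names are hypothetical. Under these assumptions the attribution of feature i is (x_i - b_i) times the path-averaged i-th score component, where b is the baseline point. By the gradient theorem, the attributions sum to the log-density gap between the outlier and the baseline, which mirrors the efficiency axiom mentioned in the abstract.

```python
import numpy as np

def gaussian_score(x, mean, cov_inv):
    """Score of a multivariate Gaussian: grad_x log p(x) = -cov_inv @ (x - mean)."""
    return -cov_inv @ (x - mean)

def gaussian_log_density(x, mean, cov_inv):
    """Gaussian log-density up to an additive constant (the constant cancels in gaps)."""
    d = x - mean
    return -0.5 * d @ cov_inv @ d

def score_integrated_gradients(x_outlier, x_baseline, score_fn, n_steps=200):
    """Integrate the score along the straight line from baseline to outlier.

    Returns one attribution per feature. The midpoint rule approximates
    the integral over t in [0, 1].
    """
    delta = x_outlier - x_baseline
    ts = (np.arange(n_steps) + 0.5) / n_steps  # midpoints of n_steps equal bins
    avg_score = np.zeros_like(delta, dtype=float)
    for t in ts:
        avg_score += score_fn(x_baseline + t * delta)
    avg_score /= n_steps
    return avg_score * delta

# Toy data (assumed for illustration): three features, the third is anomalous.
mean = np.zeros(3)
cov = np.array([[1.0, 0.5, 0.0],
                [0.5, 1.0, 0.0],
                [0.0, 0.0, 1.0]])
cov_inv = np.linalg.inv(cov)

x_outlier = np.array([0.2, -0.1, 6.0])
x_baseline = mean.copy()  # a typical "normal" point

attributions = score_integrated_gradients(
    x_outlier, x_baseline, lambda z: gaussian_score(z, mean, cov_inv)
)
log_density_gap = (gaussian_log_density(x_outlier, mean, cov_inv)
                   - gaussian_log_density(x_baseline, mean, cov_inv))

print("attributions:", attributions)
print("sum of attributions:", attributions.sum())
print("log-density gap:", log_density_gap)
```

In this toy setting the third feature receives by far the largest (most negative) attribution, since it accounts for nearly all of the drop in log-likelihood. Replacing `gaussian_score` with a learned score model would keep the integration step unchanged, although the choice of path and baseline in the paper may differ from the straight line assumed here.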

Problem

Research questions and friction points this paper is trying to address.

root cause

outliers

causal inference

anomaly detection

high-dimensional dependencies

Innovation

Methods, ideas, or system contributions that make the work stand out.

score-based attribution

integrated gradients

root cause analysis