Beyond the Single-Best Model: Rashomon Partial Dependence Profile for Trustworthy Explanations in AutoML

📅 2025-07-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
AutoML typically outputs a single optimal model, neglecting explanation uncertainty and thus failing to meet reliability and transparency requirements for explainable AI in high-stakes applications. To address this, we propose the Rashomon PDP framework—the first to incorporate model multiplicity into partial dependence plot (PDP) construction. It aggregates PDPs across the Rashomon set (i.e., the collection of near-optimal models) to yield robust, uncertainty-aware PDPs that explicitly characterize interpretive variability in feature effects. We further introduce two quantitative metrics—coverage rate and average confidence interval width—to measure explanation consistency. Experiments on 35 regression datasets from the OpenML CTR23 benchmark show that in most cases the Rashomon PDP covers less than 70% of the single best model's PDP, revealing the interpretive fragility of single-model explanations. In contrast, Rashomon PDPs substantially improve explanation stability and credibility, establishing a human-centered paradigm for trustworthy, interpretable AI.

📝 Abstract
Automated machine learning systems efficiently streamline model selection but often focus on a single best-performing model, overlooking explanation uncertainty, an essential concern in human-centered explainable AI. To address this, we propose a novel framework that incorporates model multiplicity into explanation generation by aggregating partial dependence profiles (PDPs) from a set of near-optimal models, known as the Rashomon set. The resulting Rashomon PDP captures interpretive variability and highlights areas of disagreement, providing users with a richer, uncertainty-aware view of feature effects. To evaluate its usefulness, we introduce two quantitative metrics, the coverage rate and the mean width of confidence intervals, which measure the consistency between the standard PDP and the proposed Rashomon PDP. Experiments on 35 regression datasets from the OpenML CTR23 benchmark suite show that in most cases the Rashomon PDP covers less than 70% of the best model's PDP, underscoring the limitations of single-model explanations. Our findings suggest that the Rashomon PDP improves the reliability and trustworthiness of model interpretations by adding information that would otherwise be neglected. This is particularly useful in high-stakes domains where transparency and confidence are critical.
Problem

Research questions and friction points this paper is trying to address.

Addresses explanation uncertainty in AutoML model selection
Proposes aggregating partial dependence profiles from near-optimal models
Improves reliability of model interpretations in high-stakes domains
Innovation

Methods, ideas, or system contributions that make the work stand out.

Aggregates partial dependence profiles from near-optimal models
Introduces coverage rate and confidence interval metrics
Enhances reliability with Rashomon set-based explanations
Mustafa Cavus
Eskisehir Technical University, Department of Statistics
Statistical Machine Learning · Explainable Artificial Intelligence · Design of Experiments
Jan N. van Rijn
Leiden University
Automated Machine Learning · Metalearning · Trustworthy AI
Przemysław Biecek
Faculty of Mathematics and Information Science, Warsaw University of Technology, Poland; Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Poland