🤖 AI Summary
Current XAI methods suffer from insufficient explanation robustness and low decision transparency in high-stakes scenarios, undermining AI system trustworthiness. To address this, we propose a Multi-Model Feature Importance Aggregation (MMFIA) framework that fuses local feature importance explanations from heterogeneous models—including k-nearest neighbors, random forests, and neural networks—to mitigate single-model bias and enhance explanation stability and reproducibility. MMFIA imposes no assumptions on model homogeneity or differentiability, supports black-box system integration, and incorporates a weighted consistency mechanism to suppress noisy explanations. Experiments across multiple high-risk benchmark datasets demonstrate that MMFIA significantly improves explanation robustness (average gain of 23.6%) and user trust, while preserving predictive performance. This work establishes a scalable, model-agnostic paradigm for explanation aggregation, advancing the development of trustworthy AI systems.
📝 Abstract
The use of Artificial Intelligence (AI) models in real-world, high-risk applications has intensified the discussion about their trustworthiness and ethical usage, from both a technical and a legislative perspective. The field of eXplainable Artificial Intelligence (XAI) addresses this challenge by proposing explanations that bring to light the decision-making processes of complex black-box models. Despite being an essential property, the robustness of explanations is often overlooked during development: only robust explanation methods can increase trust in the system as a whole. This paper investigates the role of robustness through a feature importance aggregation derived from multiple models ($k$-nearest neighbours, random forests and neural networks). Preliminary results showcase the potential of increasing the trustworthiness of the application while leveraging multiple models' predictive power.
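To make the idea concrete, the following is a minimal sketch of aggregating local feature importances across heterogeneous models. It is an illustrative assumption, not the paper's exact MMFIA formulation: permutation importance stands in for whatever local explanation each model produces, and held-out accuracy stands in for the weighted consistency mechanism.

```python
# Hedged sketch: fuse per-model feature importances into one vector,
# weighting each model by held-out accuracy. The weighting scheme and
# choice of permutation importance are assumptions for illustration.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Heterogeneous, model-agnostic ensemble: no differentiability assumed.
models = {
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "rf": RandomForestClassifier(random_state=0),
    "mlp": make_pipeline(StandardScaler(),
                         MLPClassifier(max_iter=500, random_state=0)),
}

weights, importances = [], []
for name, model in models.items():
    model.fit(X_tr, y_tr)
    weights.append(model.score(X_te, y_te))  # accuracy as weight (assumption)
    imp = permutation_importance(model, X_te, y_te,
                                 n_repeats=5, random_state=0)
    vec = np.clip(imp.importances_mean, 0, None)  # drop negative noise
    importances.append(vec / (vec.sum() or 1.0))  # normalize per model

# Weighted average across models yields one aggregated importance vector.
agg = np.average(importances, axis=0, weights=weights)
print(agg.round(3))
```

Because each per-model vector is normalized before averaging, no single model's importance scale dominates the aggregate, which is one simple way to mitigate single-model bias.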