🤖 AI Summary
To address the challenge of objectively quantifying user trust in XAI systems—where subjective questionnaires and isolated performance metrics fall short—this paper introduces the first quantitative evaluation framework that jointly integrates model performance with multi-source objective trust signals. The framework unifies classification accuracy, F1-score, explanation consistency, cognitive load, and fine-grained user behavioral logs into a computable composite trust metric. Empirically validated on a real-world pneumonia diagnosis task using chest X-rays, the framework achieves a trust prediction correlation of r = 0.82—significantly outperforming conventional approaches. It enables data-driven, iterative refinement of XAI systems and has been endorsed by clinical domain experts. By moving beyond reliance on subjective self-reports, this work overcomes a critical bottleneck in XAI trust assessment and establishes a new empirical paradigm for evaluating trustworthy AI.
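The summary describes a composite metric built from accuracy, F1-score, explanation consistency, cognitive load, and behavioral logs. The sketch below illustrates one plausible way such a weighted combination could be computed; the `TrustSignals` fields, their normalization to [0, 1], and the weights are illustrative assumptions, not the formulation reported in the paper.

```python
# Illustrative sketch only: component names, ranges, and weights are
# assumptions for demonstration, not the authors' published formula.
from dataclasses import dataclass

@dataclass
class TrustSignals:
    accuracy: float                 # classification accuracy in [0, 1]
    f1_score: float                 # F1-score in [0, 1]
    explanation_consistency: float  # agreement of explanations across runs, [0, 1]
    cognitive_load: float           # normalized load estimate, [0, 1] (higher = worse)
    behavioral_score: float         # trust-related behavior from interaction logs, [0, 1]

def composite_trust(s: TrustSignals,
                    weights=(0.25, 0.20, 0.20, 0.15, 0.20)) -> float:
    """Weighted combination of performance and objective trust signals.

    Cognitive load is inverted so every component contributes positively.
    The weights are placeholders, not values reported by the authors.
    """
    components = (
        s.accuracy,
        s.f1_score,
        s.explanation_consistency,
        1.0 - s.cognitive_load,
        s.behavioral_score,
    )
    return sum(w * c for w, c in zip(weights, components))

# Example: one hypothetical session from a diagnosis study.
session = TrustSignals(accuracy=0.91, f1_score=0.88,
                       explanation_consistency=0.84,
                       cognitive_load=0.30, behavioral_score=0.76)
print(f"Composite trust: {composite_trust(session):.3f}")
```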
📝 Abstract
The increasing reliance on Deep Learning models, combined with their inherent lack of transparency, has spurred the development of a novel field of study known as eXplainable AI (XAI). XAI methods seek to enhance end-user trust in automated systems by providing insights into the rationale behind their decisions. This paper presents a novel approach for measuring user trust in XAI systems, enabling their refinement. Our proposed metric combines performance metrics with objective trust indicators. To validate this methodology, we conducted a case study in a realistic medical scenario: the use of an XAI system for detecting pneumonia from chest X-ray images.
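The reported validation correlates the framework's predicted trust with a reference trust measurement across study sessions. A minimal sketch of how such a correlation check might be run is shown below; the data values and the use of Pearson's r are assumptions for illustration, not the authors' analysis code.

```python
# Minimal validation sketch (assumption): correlate predicted composite trust
# with a reference trust measurement collected per participant/session.
import numpy as np

# Hypothetical values; real data would come from the user study logs.
predicted_trust = np.array([0.72, 0.81, 0.65, 0.90, 0.58, 0.77])
reference_trust = np.array([0.70, 0.85, 0.60, 0.88, 0.55, 0.80])

# Pearson correlation coefficient between the two series.
r = np.corrcoef(predicted_trust, reference_trust)[0, 1]
print(f"Trust prediction correlation: r = {r:.2f}")
```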