🤖 AI Summary
Problem: Existing XAI research lacks quantitative, psychometrically validated methods to assess users' comprehension of tabular input features, limiting the practical utility of model explanations.
Method: We propose the first psychometrically validated framework for quantifying feature understandability: two parallel two-factor scales, one for numerical and one for categorical features, comprising 17 items in total. Structural validity and reliability are established via confirmatory factor analysis. Crucially, the scales explicitly integrate user cognitive dimensions into item design.
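As an illustration of how such a scale might be applied in practice, a per-feature understandability score could be computed by averaging a respondent's Likert item responses within each factor and then across factors. This is a minimal sketch; the item wording, factor assignment, and unit-weight aggregation rule below are assumptions, not the paper's published instrument:

```python
import numpy as np

# Hypothetical 5-point Likert responses from one participant for a single
# numerical feature; item content and factor membership are illustrative.
factor_1_items = np.array([4, 5, 4, 3])  # e.g., familiarity-related items
factor_2_items = np.array([5, 4, 4, 5])  # e.g., interpretability-related items

def understandability_score(f1: np.ndarray, f2: np.ndarray) -> float:
    """Average within each factor, then across the two factors (unit weights assumed)."""
    return float(np.mean([f1.mean(), f2.mean()]))

print(understandability_score(factor_1_items, factor_2_items))  # -> 4.25
```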
Contribution/Results: The framework enables quantitative scoring and prioritization of input features based on their understandability. Empirical evaluation demonstrates good model fit and discriminant validity, and the resulting scores support the generation of more human-centered global explanations. By grounding explainability assessment in user cognition and measurement theory, this work establishes a reusable, user-centric methodological foundation for XAI evaluation.
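A minimal sketch of how such scores might drive feature prioritization for a global explanation follows; the feature names, score values, and top-k cutoff are hypothetical:

```python
# Hypothetical per-feature understandability scores (e.g., mean scale scores
# across respondents); higher means users find the feature easier to grasp.
scores = {
    "age": 4.6,
    "annual_income": 4.1,
    "debt_to_income_ratio": 2.8,
    "zip_code_entropy": 1.9,
}

# Rank features from most to least understandable, then keep the top k
# as candidates for a human-centered global explanation.
k = 2
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked[:k])  # ['age', 'annual_income']
```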
📝 Abstract
As artificial intelligence becomes increasingly pervasive and powerful, the ability to audit AI-based systems grows ever more important. However, explainability for artificial intelligence systems is not a one-size-fits-all solution; different target audiences have varying requirements and expectations for explanations. While various approaches to explainability have been proposed, most explainable artificial intelligence (XAI) methods for tabular data focus on explaining the outputs of supervised machine learning models in terms of the input features. However, a user's ability to understand an explanation depends on their understanding of those features. It is therefore in the system designer's interest to pre-select understandable features when producing a global explanation of an ML model. Unfortunately, no measure currently exists to assess the degree to which a user understands a given input feature. This work introduces psychometrically validated scales that quantitatively assess users' understanding of tabular input features for supervised classification problems. These scales, one for numerical and one for categorical data, each have a two-factor structure and comprise 8 and 9 items, respectively. They assign a score to each input feature, effectively producing a ranking and allowing feature prioritisation to be quantified. A confirmatory factor analysis demonstrates a strong relationship between the items and a good fit of the two-factor structure for each scale. This research presents a novel method for assessing understanding and outlines potential applications in the domain of explainable artificial intelligence.
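For readers wanting to reproduce a comparable validation step, a two-factor confirmatory factor analysis of this kind can be fit with an SEM library such as semopy. The factor structure, item names, and input file below are placeholders standing in for the paper's actual scale and data, not its published specification:

```python
import pandas as pd
from semopy import Model, calc_stats

# Placeholder two-factor CFA specification in lavaan-style syntax;
# item1..item8 stand in for the 8 items of the numerical-feature scale.
spec = """
Understandability1 =~ item1 + item2 + item3 + item4
Understandability2 =~ item5 + item6 + item7 + item8
"""

# Assumed file of per-participant Likert responses, one column per item.
data = pd.read_csv("responses.csv")

model = Model(spec)
model.fit(data)

# Report fit indices (chi-square, CFI, TLI, RMSEA, ...) used to judge
# whether the hypothesised two-factor structure fits the data.
print(calc_stats(model).T)
```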