🤖 AI Summary
Rule-based machine learning models (e.g., decision trees, rule sets) are widely deployed in high-stakes domains, yet their explanations often lack formal credibility due to structural flaws such as negative overlap and redundancy.
Method: This paper introduces a formal diagnostic framework that integrates rule system theory with satisfiability (SAT) verification to systematically model and detect unreliable explanation patterns in rule models.
Contribution/Results: Empirical evaluation shows that widely used rule-learning tools, including RIPPER and C4.5, frequently produce rule sets exhibiting these defects. Crucially, the work establishes the first logic-based quantification of explanation reliability, grounded in formal consistency criteria. It provides (i) theoretically grounded reliability criteria for explainable AI, (ii) an automated detection algorithm implementable with off-the-shelf SAT solvers, and (iii) verifiable, semantics-preserving refinement strategies that improve model trustworthiness. The framework combines theoretical rigor with practical deployability, strengthening the foundations of trustworthy, interpretable AI.
📝 Abstract
A task of interest in machine learning (ML) is that of ascribing explanations to the predictions made by ML models. Furthermore, in domains deemed high risk, the rigor of explanations is paramount. Indeed, incorrect explanations can and will mislead human decision makers. As a result, and even if interpretability is acknowledged as an elusive concept, so-called interpretable models are employed ubiquitously in high-risk uses of ML and data mining (DM). This is the case for rule-based ML models, which encompass decision trees, decision diagrams, decision sets, and decision lists. This paper relates explanations with well-known undesired facets of rule-based ML models, which include negative overlap and several forms of redundancy. The paper develops algorithms for the analysis of these undesired facets of rule-based systems, and concludes that well-known and widely used tools for learning rule-based ML models will induce rule sets that exhibit one or more of these negative facets.
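To make the "negative overlap" defect concrete: two rules overlap negatively when some input satisfies both of their premises while the rules predict different classes. The paper detects such patterns with off-the-shelf SAT solvers; the sketch below is only a toy illustration of the underlying satisfiability question, assuming purely Boolean features and using brute-force enumeration in place of a real SAT solver (the rule names, feature names, and classes are invented for the example).

```python
from itertools import product

# A rule is (premises, predicted class); premises map feature names to
# required truth values, and the rule fires when every condition holds.
def premises_satisfiable(rule_a, rule_b, features):
    """Is there an assignment on which both premise sets hold at once?
    (Brute force here; a SAT solver answers the same query at scale.)"""
    for values in product([False, True], repeat=len(features)):
        point = dict(zip(features, values))
        if all(point[f] == v for f, v in rule_a.items()) and \
           all(point[f] == v for f, v in rule_b.items()):
            return True
    return False

def negative_overlap(rules):
    """Return index pairs of rules that can fire on the same input
    while predicting different classes -- the negative-overlap defect."""
    features = sorted({f for prem, _ in rules for f in prem})
    clashes = []
    for i in range(len(rules)):
        for j in range(i + 1, len(rules)):
            (prem_i, cls_i), (prem_j, cls_j) = rules[i], rules[j]
            if cls_i != cls_j and premises_satisfiable(prem_i, prem_j, features):
                clashes.append((i, j))
    return clashes

# Hypothetical rule set: rule 1's premise is strictly weaker than rule 0's,
# yet the two predict different classes, so they overlap negatively.
rules = [
    ({"fever": True, "cough": True}, "flu"),       # rule 0
    ({"fever": True}, "cold"),                     # rule 1
    ({"fever": False, "cough": False}, "healthy"), # rule 2
]
print(negative_overlap(rules))  # -> [(0, 1)]
```

A production version would encode each premise as a conjunction of literals and hand the pairwise conjunction to a SAT solver, which is what makes the check tractable for rule sets over many features.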