🤖 AI Summary
Graph Neural Networks (GNNs) suffer from poor interpretability: they operate as “black boxes,” which hinders their adoption in high-stakes domains such as drug discovery.
Method: We propose LOGICXGNN, the first model-agnostic, data-driven framework for automatic extraction of first-order logical rules from GNNs. It integrates symbolic inductive logic programming, differentiable logical reasoning, and GNN feature distillation to derive global, concept-level explanations, and it requires no predefined domain knowledge.
Contribution/Results: Used as a classifier, the extracted rules outperform the base GNNs on MUTAG, BBBP, and other benchmarks. They also have strong generative capability, enabling counterfactual graph synthesis and controllable molecular structure design. Experiments uncover chemical substructure patterns overlooked by mainstream methods and demonstrate cross-task rule transferability. LOGICXGNN establishes a new paradigm for trustworthy, AI-driven drug discovery.
📝 Abstract
Graph neural networks (GNNs) operate over both input feature spaces and combinatorial graph structures, making it challenging to understand the rationale behind their predictions. As GNNs gain widespread popularity and demonstrate success across various domains, such as drug discovery, studying their interpretability has become a critical task. Many explainability methods have been proposed in response, with recent efforts shifting from instance-specific explanations to global concept-based explainability. However, these approaches face several limitations, such as relying on predefined concepts and explaining only a limited set of patterns. To overcome these limitations, we propose a novel framework, LOGICXGNN, for extracting interpretable logic rules from GNNs. LOGICXGNN is model-agnostic, efficient, and data-driven, eliminating the need for predefined concepts. More importantly, it can serve as a rule-based classifier and can even outperform the original neural models. Its interpretability facilitates knowledge discovery, as demonstrated by its ability to extract detailed and accurate chemistry knowledge that is often overlooked by existing methods. Another key advantage of LOGICXGNN is its ability to generate new graph instances in a controlled and transparent manner, offering significant potential for applications such as drug design. We empirically demonstrate these merits through experiments on real-world datasets such as MUTAG and BBBP.
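
To make the idea of a logic rule serving as a graph classifier concrete, here is a minimal Python sketch. It is not the LOGICXGNN method and does not reproduce any rule reported in the paper. The graph representation, the predicate `has_nitro_like_group`, the size predicate, and the example rule are all hypothetical and chosen only for illustration. A rule is written in disjunctive normal form: a graph is labeled positive if every predicate in at least one clause holds.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical toy representation: node labels plus an undirected edge set.
Graph = Tuple[Dict[int, str], Set[Tuple[int, int]]]
Predicate = Callable[[Graph], bool]
# A rule in disjunctive normal form: OR over clauses, AND within each clause.
Rule = List[List[Predicate]]


def neighbors(graph: Graph, node: int) -> List[int]:
    """Return the nodes that share an edge with `node`."""
    _, edges = graph
    return [b if a == node else a for a, b in edges if node in (a, b)]


def has_nitro_like_group(graph: Graph) -> bool:
    """Illustrative concept: some N atom is bonded to at least two O atoms."""
    labels, _ = graph
    return any(
        label == "N"
        and sum(labels[n] == "O" for n in neighbors(graph, node)) >= 2
        for node, label in labels.items()
    )


def has_at_least_three_atoms(graph: Graph) -> bool:
    """Illustrative concept: the graph has three or more nodes."""
    labels, _ = graph
    return len(labels) >= 3


def classify(rule: Rule, graph: Graph) -> bool:
    """Positive if all predicates of at least one clause are satisfied."""
    return any(all(pred(graph) for pred in clause) for clause in rule)


# Hypothetical rule: positive <- nitro_like_group AND at_least_three_atoms
example_rule: Rule = [[has_nitro_like_group, has_at_least_three_atoms]]

# Two toy graphs: a C-NO2 fragment and a bare C-C bond.
nitro_fragment: Graph = (
    {0: "N", 1: "O", 2: "O", 3: "C"},
    {(0, 1), (0, 2), (0, 3)},
)
ethane_like: Graph = ({0: "C", 1: "C"}, {(0, 1)})

nitro_prediction = classify(example_rule, nitro_fragment)
ethane_prediction = classify(example_rule, ethane_like)
```

Because the decision is a plain Boolean formula over named predicates, each prediction can be traced back to the clause that fired, which is the kind of transparency the abstract attributes to rule-based classification.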