🤖 AI Summary
Graph neural networks (GNNs) achieve strong performance in molecular property prediction, yet their black-box nature and complex message-passing mechanisms yield unreliable atom- or substructure-level attributions, hindering trustworthy deployment in critical domains such as drug discovery. To address this, we propose SEAL—a chemically grounded, causally informed attribution framework that decomposes molecular graphs into chemically plausible substructures, enforces constrained message passing to explicitly suppress inter-fragment information flow, and jointly optimizes for both prediction and attribution. Unlike existing methods, SEAL enables rigorous, causal fragment-level contribution quantification. Empirical evaluation on synthetic and real-world molecular datasets demonstrates significant improvements in attribution fidelity. Furthermore, user studies with domain-expert chemists confirm that SEAL’s attributions align more closely with chemical intuition, offering enhanced transparency, interpretability, and trustworthiness without compromising predictive accuracy.
📝 Abstract
Graph neural networks have demonstrated remarkable success in predicting molecular properties by leveraging the rich structural information encoded in molecular graphs. However, their black-box nature reduces interpretability, which limits trust in their predictions for important applications such as drug discovery and materials design. Furthermore, existing explanation techniques often fail to reliably quantify the contribution of individual atoms or substructures due to the entangled message-passing dynamics. We introduce SEAL (Substructure Explanation via Attribution Learning), a new interpretable graph neural network that attributes model predictions to meaningful molecular subgraphs. SEAL decomposes input graphs into chemically relevant fragments and estimates their causal influence on the output. The strong alignment between fragment contributions and model predictions is achieved by explicitly reducing inter-fragment message passing in our proposed model architecture. Extensive evaluations on synthetic benchmarks and real-world molecular datasets demonstrate that SEAL outperforms other explainability methods in both quantitative attribution metrics and human-aligned interpretability. A user study further confirms that SEAL provides more intuitive and trustworthy explanations to domain experts. By bridging the gap between predictive performance and interpretability, SEAL offers a promising direction for more transparent and actionable molecular modeling.