🤖 AI Summary
Graph neural networks (GNNs) achieve strong performance in molecular property prediction, yet accuracy gains often incur substantial computational and memory overhead. To address this trade-off, the authors propose Layer-to-Layer Knowledge Mixing (LKM), a self-knowledge distillation framework that minimizes the mean absolute distance between the hidden representations of different GNN layers, efficiently aggregating multi-hop and multi-scale structural information without increasing inference cost. LKM is architecture-agnostic (evaluated with DimeNet++, MXMNet, and PAMNet) and operates on the pre-existing layer embeddings, so it adds negligible training overhead and no extra parameters. On QM9, MD17, and Chignolin, LKM reduces mean absolute error by up to 9.8%, 45.3%, and 22.9%, respectively, improving accuracy without sacrificing efficiency.
📝 Abstract
Graph Neural Networks (GNNs) are currently the most effective methods for predicting molecular properties, but there remains a need for more accurate models. GNN accuracy can be improved by increasing model complexity, but this also increases the computational cost and memory requirements during training and inference. In this study, we develop Layer-to-Layer Knowledge Mixing (LKM), a novel self-knowledge distillation method that increases the accuracy of state-of-the-art GNNs while adding negligible computational complexity during training and inference. By minimizing the mean absolute distance between pre-existing hidden embeddings of GNN layers, LKM efficiently aggregates multi-hop and multi-scale information, enabling improved representation of both local and global molecular features. We evaluated LKM using three diverse GNN architectures (DimeNet++, MXMNet, and PAMNet) on datasets of quantum chemical and biophysical properties (QM9, MD17, and Chignolin). We found that LKM reduces the mean absolute error of property predictions by up to 9.8% (QM9), 45.3% (MD17 Energy), and 22.9% (Chignolin). This work demonstrates the potential of LKM to significantly improve the accuracy of GNNs for chemical property prediction without any substantial increase in training and inference cost.
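The core mechanism described above — minimizing the mean absolute distance between the hidden embeddings that a GNN's layers already produce — can be sketched as an auxiliary loss. This is a minimal illustration, not the authors' implementation: the abstract does not specify which layer pairs are compared or how the auxiliary term is weighted against the task loss, so the all-pairs averaging and the `lkm_loss` name below are assumptions.

```python
import numpy as np

def lkm_loss(layer_embeddings):
    """Sketch of an LKM-style self-distillation loss (assumed form).

    Takes the list of per-layer hidden embeddings a GNN already
    computes during its forward pass (each an array of shape
    [num_nodes, hidden_dim]) and returns the mean absolute distance
    averaged over all layer pairs. Because it reuses pre-existing
    embeddings, it adds no parameters and no inference-time cost.
    """
    total, pairs = 0.0, 0
    for i in range(len(layer_embeddings)):
        for j in range(i + 1, len(layer_embeddings)):
            # Mean absolute distance between the two layers' embeddings.
            total += np.mean(np.abs(layer_embeddings[i] - layer_embeddings[j]))
            pairs += 1
    return total / pairs
```

In training, this term would be added to the usual property-prediction loss (e.g. `loss = task_loss + lam * lkm_loss(embeddings)` for some small weight `lam`, a hypothetical hyperparameter), pulling the layers' representations toward a shared, mixed view of local and global structure.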