🤖 AI Summary
Existing GNNs struggle to capture long-range dependencies in molecular graphs, which limits their performance on molecular property prediction for drug discovery. To address this, the authors propose MolGraph-xLSTM, a graph-based xLSTM model that processes each molecule at two scales. At the atom level, a GNN-based xLSTM framework with jumping knowledge extracts local features and aggregates information across layers to capture both local and long-range structural patterns. At the motif level, a second graph provides a complementary, coarser view of the molecule. Embeddings from both scales are then refined by a multi-head mixture-of-experts (MHMoE) module. The summary credits the work with three contributions: the first dual-level xLSTM architecture for molecular graphs, the combination of jumping knowledge with xLSTM for atom-level representation learning, and the use of motif-level graph construction and MHMoE to improve both interpretability and expressive power. Evaluated on ten molecular property prediction benchmarks, the method outperforms the baseline methods: it achieves an average AUROC gain of 3.18% on classification tasks and an average RMSE reduction of 3.83% on regression tasks, with the largest improvements being 7.03% on BBBP (classification) and 7.54% on ESOL (regression).
📝 Abstract
Predicting molecular properties is essential for drug discovery, and computational methods can greatly accelerate this process. Molecular graphs have become a focus of representation learning, with Graph Neural Networks (GNNs) widely used to encode them. However, GNNs often struggle to capture long-range dependencies. To address this, we propose MolGraph-xLSTM, a novel graph-based xLSTM model that enhances feature extraction and effectively models long-range interactions within molecules. Our approach processes molecular graphs at two scales: atom-level and motif-level. For atom-level graphs, a GNN-based xLSTM framework with jumping knowledge extracts local features and aggregates multilayer information to capture both local and global patterns. Motif-level graphs provide complementary structural information, giving a broader view of the molecule. Embeddings from both scales are refined via a multi-head mixture of experts (MHMoE), further enhancing expressiveness and performance. We validate MolGraph-xLSTM on 10 molecular property prediction datasets covering both classification and regression tasks. Our model performs consistently across all datasets, with improvements over the baselines of up to 7.03% on the BBBP dataset for classification and 7.54% on the ESOL dataset for regression. On average, MolGraph-xLSTM achieves an AUROC improvement of 3.18% on classification datasets and an RMSE reduction of 3.83% on regression datasets relative to the baseline methods. These results confirm the effectiveness of our model and offer a promising solution for molecular representation learning in drug discovery.
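To make the two aggregation ideas named in the abstract more concrete, here is a minimal pure-Python sketch of jumping-knowledge layer aggregation and a softmax-gated mixture of experts applied per head. It is an illustration under stated assumptions, not the MolGraph-xLSTM implementation: the function names (`jumping_knowledge`, `mixture_of_experts`, `multi_head_moe`), the dense (non-sparse) gating, the equal-size head split, and the toy experts are all hypothetical choices made here, and the abstract does not specify which variants the authors use.

```python
import math


def jumping_knowledge(layer_embeddings, mode="max"):
    """Combine one node's embeddings from several GNN layers.

    layer_embeddings: list of equal-length float lists, one per layer.
    "concat" stacks all layers; "max" takes the element-wise maximum.
    """
    if mode == "concat":
        return [x for layer in layer_embeddings for x in layer]
    if mode == "max":
        return [max(dims) for dims in zip(*layer_embeddings)]
    raise ValueError(f"unknown mode: {mode}")


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_of_experts(x, experts, gate_vectors):
    """Dense softmax-gated mixture: sum_i gate_i(x) * expert_i(x).

    gate_vectors holds one vector per expert (assumed, illustrative);
    the gate logit is its dot product with x.
    """
    logits = [sum(w * xi for w, xi in zip(vec, x)) for vec in gate_vectors]
    gates = softmax(logits)
    outputs = [expert(x) for expert in experts]
    width = len(outputs[0])
    return [sum(g * out[d] for g, out in zip(gates, outputs)) for d in range(width)]


def multi_head_moe(x, num_heads, experts, gate_vectors):
    """Split x into equal chunks (heads), mix experts per chunk, re-concatenate."""
    if len(x) % num_heads != 0:
        raise ValueError("embedding size must be divisible by num_heads")
    size = len(x) // num_heads
    merged = []
    for h in range(num_heads):
        chunk = x[h * size:(h + 1) * size]
        merged.extend(mixture_of_experts(chunk, experts, gate_vectors))
    return merged


# Toy usage: two layers of a 2-dimensional atom embedding, then two toy experts.
atom_embedding = jumping_knowledge([[1.0, 5.0], [3.0, 2.0]], mode="concat")
toy_experts = [lambda v: [2 * a for a in v], lambda v: [-a for a in v]]
zero_gates = [[0.0, 0.0], [0.0, 0.0]]  # zero logits give uniform gate weights
fused = multi_head_moe(atom_embedding, 2, toy_experts, zero_gates)
```

In a real model the experts would be learned networks and the gate vectors learned parameters; the toy values above only show the data flow from per-layer embeddings to a fused representation.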