AI Summary
Molecular conformational dynamics analysis has long relied on hand-crafted features and has lacked general-purpose, automated geometric representations. To address this, we propose geom2vec: a self-supervised denoising pretraining framework based on equivariant graph neural networks (GNNs) that learns transferable 3D geometric features directly from large-scale molecular conformation datasets. Its architecture decouples the GNN backbone from a token mixer, jointly ensuring SE(3) equivariance, interpretability, and computational efficiency. Crucially, geom2vec requires no fine-tuning or system-specific hyperparameter tuning, enabling out-of-the-box conformational analysis of small proteins at all-atom resolution. Experiments demonstrate that, under limited computational resources, geom2vec significantly improves the robustness and generalization of conformational dynamics modeling, fully eliminating dependence on manual feature engineering.
Abstract
Identifying informative low-dimensional features that characterize dynamics in molecular simulations remains a challenge, often requiring extensive manual tuning and system-specific knowledge. Here, we introduce geom2vec, in which pretrained graph neural networks (GNNs) are used as universal geometric featurizers. By pretraining equivariant GNNs on a large dataset of molecular conformations with a self-supervised denoising objective, we obtain transferable structural representations that are useful for learning conformational dynamics without further fine-tuning. We show how the learned GNN representations can capture interpretable relationships between structural units (tokens) by combining them with expressive token mixers. Importantly, decoupling training the GNNs from training for downstream tasks enables analysis of larger molecular graphs (such as small proteins at all-atom resolution) with limited computational resources. In these ways, geom2vec eliminates the need for manual feature selection and increases the robustness of simulation analyses.
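The self-supervised denoising objective described above can be illustrated with a minimal sketch: corrupt 3D atomic coordinates with Gaussian noise and train a model to recover that noise. The linear stand-in model, function names, and noise scale below are illustrative assumptions, not the paper's actual equivariant GNN architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def denoising_loss(predict_noise, coords, sigma=0.1):
    """Corrupt coordinates with Gaussian noise of scale sigma, then score
    the model on recovering that noise (mean squared error), as in
    coordinate-denoising pretraining. `predict_noise` is any callable
    mapping noisy coordinates to a predicted noise array of the same shape."""
    noise = sigma * rng.standard_normal(coords.shape)
    noisy_coords = coords + noise
    predicted = predict_noise(noisy_coords)
    return float(np.mean((predicted - noise) ** 2))

# Toy stand-in for the pretrained network: predicts zero noise everywhere.
# A real run would instead optimize an equivariant GNN to minimize this loss.
coords = rng.standard_normal((5, 3))  # 5 atoms, xyz coordinates
loss = denoising_loss(lambda x: np.zeros_like(x), coords, sigma=0.1)
```

With the zero predictor, the loss is simply the mean squared noise, so it sits near `sigma**2`; a trained denoiser would drive it lower, and its intermediate representations are what geom2vec reuses as frozen features for downstream dynamics tasks.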