🤖 AI Summary
In 3D masked graph modeling for molecules, leaking 2D structural information to the decoder undermines the learning of genuine 3D geometric relationships, yet the decoder still needs enough 2D context to reconstruct re-masked atoms. To address this, the authors propose Selective Re-mask Decoding (SRD): during decoding, only the 3D-relevant information in the encoder representations is re-masked, while the 2D graph structure is preserved as context. SRD is integrated into a masked graph auto-encoder framework (3D-GSRD) that couples a 3D Relational-Transformer (3D-ReTrans) encoder with a structure-independent decoder, so that more of the representation learning falls to the encoder. On the MD17 molecular property prediction benchmark, the method sets a new state of the art on 7 of 8 targets. The approach aims to combine 2D context preservation with 3D geometric awareness in molecular representation learning.
📝 Abstract
Masked graph modeling (MGM) is a promising approach for molecular representation learning (MRL). However, extending the success of re-mask decoding from 2D to 3D MGM is non-trivial, primarily because of two conflicting challenges: avoiding 2D structure leakage to the decoder while still providing sufficient 2D context for reconstructing re-masked atoms. To address these challenges, we propose 3D-GSRD: a 3D Molecular Graph Auto-Encoder with Selective Re-mask Decoding. The core innovation of 3D-GSRD is its Selective Re-mask Decoding (SRD), which re-masks only 3D-relevant information from encoder representations while preserving the 2D graph structures. SRD is integrated with a 3D Relational-Transformer (3D-ReTrans) encoder and a structure-independent decoder. Our analysis indicates that SRD, combined with the structure-independent decoder, enhances the encoder's role in MRL. Extensive experiments show that 3D-GSRD achieves strong downstream performance, setting a new state of the art on 7 out of 8 targets in the widely used MD17 molecular property prediction benchmark. The code is released at https://github.com/WuChang0124/3D-GSRD.
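The re-masking idea described above can be illustrated with a minimal NumPy sketch. This is not the released implementation: the function name `selective_remask`, the `context_2d` array (a stand-in for an embedding computed from the 2D graph only), and the zero `mask_token` (a stand-in for a learnable token) are all assumptions made for illustration. The sketch only shows the general pattern of discarding 3D-aware encoder outputs at masked atoms while leaving 2D-derived context in place.

```python
import numpy as np

def selective_remask(encoder_out, masked_idx, mask_token, context_2d):
    """Hypothetical helper: replace 3D-aware encoder outputs at masked atoms
    with a mask token plus a 2D-only context embedding."""
    out = encoder_out.copy()
    # Masked atoms lose their 3D-aware representation but keep 2D context.
    out[masked_idx] = mask_token + context_2d[masked_idx]
    return out

rng = np.random.default_rng(0)
num_atoms, dim = 5, 4
encoder_out = rng.normal(size=(num_atoms, dim))  # assumed 3D-aware encoder output
context_2d = rng.normal(size=(num_atoms, dim))   # assumed embedding from the 2D graph only
mask_token = np.zeros(dim)                       # stand-in for a learnable mask token
masked_idx = np.array([1, 3])                    # atoms whose 3D information was masked

decoder_in = selective_remask(encoder_out, masked_idx, mask_token, context_2d)
```

In this sketch, unmasked atoms pass through unchanged, and a decoder that receives `decoder_in` would see no 3D-derived signal at the masked positions.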