Physical probes expose and alleviate chemical-environment collapse in molecular representations

📅 2026-05-11

🤖 AI Summary
This work addresses “chemical-environment collapse” in molecular representation learning: atoms that are equivalent in molecular topology receive indistinguishable representations even though their real chemical environments remain experimentally distinct. To mitigate this, the authors construct complementary high-fidelity experimental and computational ¹³C NMR resources and propose CLAIM, a framework that combines hierarchical chemical priors with cross-level contrastive learning. Without requiring explicit 3D molecular structures, CLAIM aligns topological molecular inputs with atom-resolved NMR observables. The method markedly improves atom-level molecule-spectrum retrieval, remains robust on flexible and tautomeric systems for ¹³C NMR chemical shift prediction, improves stereoisomer discrimination, and transfers to ADMET and fluorescence property prediction. The study characterizes chemical-environment collapse and presents physically grounded spectral alignment as an effective way to alleviate it.
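To make "equivalent in molecular topology" concrete, here is a minimal sketch of neighbourhood-label refinement (in the style of Weisfeiler-Lehman) on a hand-written heavy-atom graph. The molecule (3-methyl-2-butanol), the atom indexing, and the function `refine_labels` are illustrative assumptions, not taken from the paper. The two isopropyl methyl carbons end up with identical labels, although they sit next to a stereocentre, are diastereotopic, and therefore need not share one ¹³C shift.

```python
# Heavy-atom graph of 3-methyl-2-butanol, CC(O)C(C)C.
# Illustrative choice of molecule and indexing; not from the paper.
atoms = ["C", "C", "O", "C", "C", "C"]
bonds = [(0, 1), (1, 2), (1, 3), (3, 4), (3, 5)]


def refine_labels(atoms, bonds, rounds=3):
    """Relabel each atom by its own label plus its sorted neighbour labels.

    Atoms that still share a label after refinement are indistinguishable
    to a model that sees only element types and connectivity.
    """
    neighbours = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)

    labels = list(atoms)
    for _ in range(rounds):
        labels = [
            labels[i] + "(" + ",".join(sorted(labels[j] for j in neighbours[i])) + ")"
            for i in range(len(atoms))
        ]
    return labels


labels = refine_labels(atoms, bonds)

# Atoms 4 and 5 are the two isopropyl methyl carbons.
same_class = labels[4] == labels[5]
print("isopropyl methyls share a topological label:", same_class)
```

A purely topological encoder starts from exactly this kind of label, so any difference between the two methyl carbons has to come from an additional signal such as the atom-resolved NMR observables described in the summary.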
📝 Abstract
Nuclear magnetic resonance (NMR) spectroscopy provides an experimental readout of local chemical environments, but its use in molecular representation learning has been constrained by heterogeneous data and incomplete atom-level assignments. Here we construct complementary high-fidelity experimental and computational 13C NMR resources, which reveal a recurrent form of representational collapse: atoms that are equivalent in molecular topology can remain experimentally distinct in their real chemical environments, whereas explicit 3D descriptions are further limited by static conformations in dynamic regimes. To alleviate this bottleneck, we develop CLAIM (Contrastive Learning for Atom-to-molecule Inference of Molecular NMR), a framework that aligns efficient topological molecular inputs with atom-resolved NMR observables. Through hierarchical chemical priors and cross-level contrastive learning, CLAIM restores lost chemical resolution and markedly improves atom-level molecule-spectrum retrieval. CLAIM remains robust in flexible and tautomeric systems for 13C NMR prediction, improves stereoisomer discrimination without explicit 3D modelling, and transfers to broader molecular property tasks including ADMET prediction and fluorescence estimation. These results establish physically grounded spectral alignment as an effective strategy for alleviating chemical-environment collapse and for guiding experimentally grounded molecular representation learning.
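The abstract does not spell out the training objective, so the following is only a sketch of one common contrastive loss, a symmetric InfoNCE, applied to paired atom and peak embeddings. The function names, the temperature value, and the toy two-dimensional embeddings are all assumptions made for illustration; this is not CLAIM's actual loss or architecture.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(anchors, targets, temperature=0.1):
    """Mean cross-entropy of matching anchors[i] to targets[i] among all targets."""
    loss = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, t) / temperature for t in targets]
        # Numerically stable log-sum-exp for the softmax denominator.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        loss += log_denominator - logits[i]
    return loss / len(anchors)


def symmetric_info_nce(atom_emb, peak_emb, temperature=0.1):
    """Average of the atom-to-peak and peak-to-atom directions."""
    return 0.5 * (
        info_nce(atom_emb, peak_emb, temperature)
        + info_nce(peak_emb, atom_emb, temperature)
    )


# Toy data: three hypothetical atom embeddings and their matching peak embeddings.
atom_emb = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
aligned = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
shuffled = [aligned[1], aligned[2], aligned[0]]  # wrong atom-peak pairing

loss_aligned = symmetric_info_nce(atom_emb, aligned)
loss_shuffled = symmetric_info_nce(atom_emb, shuffled)
print(loss_aligned < loss_shuffled)
```

Minimizing a loss of this form pulls each atom embedding toward its own spectral observation and away from the others in the batch, which is the general mechanism by which contrastive alignment can separate atoms that a topological encoder would otherwise merge.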
Problem

Research questions and friction points this paper is trying to address.

chemical-environment collapse
molecular representation learning
NMR spectroscopy
atom-level resolution
dynamic molecular systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

chemical-environment collapse
contrastive learning
molecular representation learning
NMR spectroscopy
atom-resolved prediction
Jiebin Fang
Hainan Institute, Zhejiang University, Sanya 572025, China; Institute of Marine Biology and Pharmacology, Ocean College, Zhejiang University, Zhoushan 316021, China
Zidi Yan
Hainan Institute, Zhejiang University, Sanya 572025, China; Institute of Marine Biology and Pharmacology, Ocean College, Zhejiang University, Zhoushan 316021, China
Churu Mao
Institute of Marine Biology and Pharmacology, Ocean College, Zhejiang University, Zhoushan 316021, China
Yongjun Jiang
School of Food and Pharmacy, Zhejiang Ocean University, Zhoushan 316021, China
Xinyi Tang
Institute of Marine Biology and Pharmacology, Ocean College, Zhejiang University, Zhoushan 316021, China
Lei Miao
Hangzhou Bio-Sincerity Pharmaceutical Technology Company Limited, Hangzhou 311103, China
Dan Lu
Chinese University of Hong Kong
Yun Huang
Institute of Marine Biology and Pharmacology, Ocean College, Zhejiang University, Zhoushan 316021, China
Wanjing Ding
Hainan Institute, Zhejiang University, Sanya 572025, China; Institute of Marine Biology and Pharmacology, Ocean College, Zhejiang University, Zhoushan 316021, China
Zhongjun Ma
Hainan Institute, Zhejiang University, Sanya 572025, China; Institute of Marine Biology and Pharmacology, Ocean College, Zhejiang University, Zhoushan 316021, China