NMRTrans: Structure Elucidation from Experimental NMR Spectra via Set Transformers

2026-02-10
Citations: 0
Influential: 0
AI Summary
Interpreting nuclear magnetic resonance (NMR) spectra has long relied on expert knowledge, and existing computational methods degrade significantly on real experimental spectra. To address this, the work introduces NMRSpec, a large-scale dataset of experimental ¹H and ¹³C NMR spectra mined from the chemical literature, and proposes NMRTrans, which the authors describe as, to the best of their knowledge, the first NMR Transformer trained solely on large-scale experimental spectra. NMRTrans models each spectrum as an unordered set of peaks and uses a Set Transformer so that the model's inductive bias matches the physical nature of NMR, removing the reliance on computationally simulated spectra. On experimental benchmarks, NMRTrans reaches a Top-10 accuracy of 61.15%, outperforming the strongest baseline (43.33%) by 17.82 percentage points.
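As a rough illustration of the Top-10 metric quoted above, the sketch below computes Top-k accuracy from ranked candidate lists. It assumes that a prediction counts as a hit when the target string appears verbatim among the first k candidates (for example, after both sides have been canonicalized); the summary does not state the paper's matching rule, and the function name `top_k_accuracy` and the toy SMILES strings are hypothetical.

```python
def top_k_accuracy(ranked_candidates, targets, k=10):
    """Fraction of spectra whose true structure is among the first k candidates.

    Assumption: structures are compared as plain strings, so both sides must
    already be in the same canonical form.
    """
    if len(ranked_candidates) != len(targets):
        raise ValueError("need exactly one candidate list per target")
    if not targets:
        raise ValueError("empty evaluation set")
    hits = sum(1 for cands, target in zip(ranked_candidates, targets)
               if target in cands[:k])
    return hits / len(targets)


# Toy, made-up predictions for three spectra (best candidate first).
predictions = [
    ["CCO", "COC"],
    ["CC(=O)O", "OCC=O"],
    ["c1ccccc1", "C1CCCCC1"],
]
truth = ["COC", "CC(C)O", "c1ccccc1"]

top1 = top_k_accuracy(predictions, truth, k=1)
top2 = top_k_accuracy(predictions, truth, k=2)

# The reported margin is a difference in percentage points.
gain_points = round(61.15 - 43.33, 2)
```

Widening k can only keep or raise the score, which is why Top-10 figures are higher than Top-1 figures for the same model.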

๐Ÿ“ Abstract
Nuclear Magnetic Resonance (NMR) spectroscopy is fundamental for molecular structure elucidation, yet interpreting spectra at scale remains time-consuming and highly expertise-dependent. While recent spectrum-as-language modeling and retrieval-based methods have shown promise, they rely heavily on large corpora of computed spectra and exhibit notable performance drops when applied to experimental measurements. To address these issues, we build NMRSpec, a large-scale corpus of experimental $^1$H and $^{13}$C spectra mined from chemical literature, and propose NMRTrans, which models spectra as unordered peak sets and aligns the model's inductive bias with the physical nature of NMR. To the best of our knowledge, NMRTrans is the first NMR Transformer trained solely on large-scale experimental spectra. It achieves state-of-the-art performance on experimental benchmarks, improving Top-10 accuracy over the strongest baseline by 17.82 points (61.15% vs. 43.33%), which underscores the importance of experimental data and structure-aware architectures for reliable NMR structure elucidation.
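To make the idea of treating a spectrum as an unordered peak set concrete, here is a minimal sketch of a permutation-invariant pooling step. It uses single-query softmax attention over hand-written peak features, a simplified relative of the attention pooling found in Set Transformers. It is not the NMRTrans architecture: the feature function `embed_peak`, the fixed `query` vector, the normalization ranges, and the shift values are all illustrative assumptions.

```python
import math
import random


def embed_peak(shift_ppm, nucleus):
    """Toy peak features; a real model would learn this embedding."""
    is_h = 1.0 if nucleus == "1H" else 0.0
    # Assumed rough ppm ranges, used only to bring both nuclei to a similar scale.
    scale = 10.0 if nucleus == "1H" else 200.0
    x = shift_ppm / scale
    return [x, x * x, is_h, 1.0 - is_h]


def attention_pool(peaks, query):
    """Softmax-weighted average of peak features.

    The result depends only on which peaks are present, not on their order,
    because both the softmax normalizer and the weighted sum run over the set.
    """
    feats = [embed_peak(shift, nucleus) for shift, nucleus in peaks]
    scores = [sum(q * f for q, f in zip(query, feat)) for feat in feats]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * feat[i] for w, feat in zip(weights, feats)) / total
            for i in range(len(query))]


# Made-up peak list: (chemical shift in ppm, nucleus).
peaks = [(1.2, "1H"), (3.7, "1H"), (2.6, "1H"), (18.0, "13C"), (58.0, "13C")]
query = [0.5, -1.0, 0.3, 0.8]  # hypothetical; learned in a trained model

shuffled = peaks[:]
random.Random(0).shuffle(shuffled)

pooled = attention_pool(peaks, query)
pooled_shuffled = attention_pool(shuffled, query)
```

Reordering the peak list leaves the pooled vector unchanged up to floating-point rounding, which is the property a sequence model without set structure would have to learn from data.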
Problem

Research questions and friction points this paper is trying to address.

NMR spectroscopy
structure elucidation
experimental spectra
molecular structure
spectrum interpretation
Innovation

Methods, ideas, or system contributions that make the work stand out.

NMR spectroscopy
Set Transformers
experimental spectra
structure elucidation
NMRTrans