DiffSpectra: Molecular Structure Elucidation from Spectra using Diffusion Models

📅 2025-07-09
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Molecular structure elucidation from spectroscopic data is a fundamental challenge in chemical identification and drug discovery; however, traditional expert-based approaches lack scalability, while existing machine learning methods are constrained by reliance on reference databases or autoregressive SMILES generation, limiting generalization to novel molecules and neglecting 3D geometric information. This paper introduces DiffSpectra, the first end-to-end generative framework jointly modeling multimodal spectra (e.g., NMR, IR, MS) and both 2D and 3D molecular structures. Its key contributions are: (1) an SE(3)-equivariant diffusion model for geometry-aware 3D molecular generation; (2) SpecFormer, a unified spectral encoder that jointly represents heterogeneous spectra and injects conditional information; and (3) a Diffusion Molecule Transformer as the core denoising backbone. On standard benchmarks, DiffSpectra achieves 16.01% top-1 and 96.86% top-20 accuracy, empirically validating the efficacy of 3D structural modeling, spectral pretraining, and multimodal conditional fusion.

📝 Abstract
Molecular structure elucidation from spectra is a foundational problem in chemistry, with profound implications for compound identification, synthesis, and drug development. Traditional methods rely heavily on expert interpretation and lack scalability. Pioneering machine learning methods have introduced retrieval-based strategies, but their reliance on finite libraries limits generalization to novel molecules. Generative models offer a promising alternative, yet most adopt autoregressive SMILES-based architectures that overlook 3D geometry and struggle to integrate diverse spectral modalities. In this work, we present DiffSpectra, a generative framework that directly infers both 2D and 3D molecular structures from multi-modal spectral data using diffusion models. DiffSpectra formulates structure elucidation as a conditional generation process. Its denoising network is parameterized by Diffusion Molecule Transformer, an SE(3)-equivariant architecture that integrates topological and geometric information. Conditioning is provided by SpecFormer, a transformer-based spectral encoder that captures intra- and inter-spectral dependencies from multi-modal spectra. Extensive experiments demonstrate that DiffSpectra achieves high accuracy in structure elucidation, recovering exact structures with 16.01% top-1 accuracy and 96.86% top-20 accuracy through sampling. The model benefits significantly from 3D geometric modeling, SpecFormer pre-training, and multi-modal conditioning. These results highlight the effectiveness of spectrum-conditioned diffusion modeling in addressing the challenge of molecular structure elucidation. To our knowledge, DiffSpectra is the first framework to unify multi-modal spectral reasoning and joint 2D/3D generative modeling for de novo molecular structure elucidation.
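
The top-1 and top-20 figures above are exact-match rates over sampled candidates. A minimal sketch of how such a metric could be computed follows. It assumes structures are compared as canonical strings and that candidates are ranked by how often they were sampled; the source does not state the ranking rule, so `rank_by_frequency`, `top_k_accuracy`, and the toy data are all hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def rank_by_frequency(samples: Sequence[str]) -> List[str]:
    """Deduplicate sampled structures, most frequently sampled first.

    Assumption: ranking by sample frequency is an illustrative choice,
    not something the source specifies.
    """
    return [structure for structure, _ in Counter(samples).most_common()]


def top_k_accuracy(
    ranked: Sequence[Sequence[str]], targets: Sequence[str], k: int
) -> float:
    """Fraction of targets found among their first k ranked candidates."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not targets or len(ranked) != len(targets):
        raise ValueError("targets must be non-empty and match ranked in length")
    hits = sum(
        target in candidates[:k] for candidates, target in zip(ranked, targets)
    )
    return hits / len(targets)


# Toy data: hypothetical canonical strings standing in for sampled structures.
samples = [
    ["CCO", "CCO", "COC", "CCN"],            # samples drawn for spectrum 0
    ["C1CCCCC1", "c1ccccc1", "C1CCCCC1"],    # samples drawn for spectrum 1
]
targets = ["CCO", "c1ccccc1"]

ranked = [rank_by_frequency(s) for s in samples]
top1 = top_k_accuracy(ranked, targets, k=1)
top2 = top_k_accuracy(ranked, targets, k=2)
```

Exact string matching is only meaningful if generated and reference structures are canonicalized the same way before comparison; that step is left out of this sketch.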
Problem

Research questions and friction points this paper is trying to address.

Elucidating molecular structures from multi-modal spectral data
Overcoming limitations of traditional and retrieval-based methods
Integrating 2D/3D molecular geometry with spectral information during generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses diffusion models for molecular structure elucidation
Integrates 2D and 3D molecular structures
Employs transformer-based spectral encoder
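
To make the SE(3)-equivariance property mentioned above concrete, here is a small sketch that checks it numerically. It does not reproduce the Diffusion Molecule Transformer: `coordinate_update` is a hypothetical toy layer that moves each atom along relative vectors weighted by interatomic distance, a common way to build equivariant updates. The property tested is f(Rx + t) = R f(x) + t for a rotation R and translation t.

```python
import math
from typing import Callable, List, Tuple

Point = Tuple[float, float, float]


def rotate_z(p: Point, angle: float) -> Point:
    """Rotate a point about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = p
    return (c * x - s * y, s * x + c * y, z)


def rotate_x(p: Point, angle: float) -> Point:
    """Rotate a point about the x-axis."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = p
    return (x, c * y - s * z, s * y + c * z)


def rigid_motion(p: Point) -> Point:
    """One fixed, arbitrarily chosen rotation followed by a translation."""
    x, y, z = rotate_x(rotate_z(p, 0.7), -1.2)
    return (x + 2.0, y - 1.0, z + 0.5)


def coordinate_update(coords: List[Point], step: float = 0.1) -> List[Point]:
    """Toy equivariant layer (hypothetical, not the paper's architecture).

    Weights depend only on distances, which are invariant; displacements
    are built from relative vectors, which rotate with the input.
    """
    updated = []
    for i, xi in enumerate(coords):
        delta = [0.0, 0.0, 0.0]
        for j, xj in enumerate(coords):
            if i == j:
                continue
            weight = 1.0 / (1.0 + math.dist(xi, xj) ** 2)
            for axis in range(3):
                delta[axis] += weight * (xi[axis] - xj[axis])
        updated.append(tuple(xi[a] + step * delta[a] for a in range(3)))
    return updated


def equivariance_gap(
    layer: Callable[[List[Point]], List[Point]], coords: List[Point]
) -> float:
    """Largest coordinate difference between f(g(x)) and g(f(x))."""
    moved_then_updated = layer([rigid_motion(p) for p in coords])
    updated_then_moved = [rigid_motion(p) for p in layer(coords)]
    return max(
        abs(a - b)
        for p, q in zip(moved_then_updated, updated_then_moved)
        for a, b in zip(p, q)
    )


atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.5, 0.0), (0.3, 0.2, 1.1)]
gap = equivariance_gap(coordinate_update, atoms)
```

A gap near floating-point precision indicates the layer commutes with the rigid motion. A layer that used absolute coordinates as features (for example, adding a fixed offset scaled by `x`) would generally fail this check.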
Authors
Liang Wang
NLPR, MAIS, Institute of Automation, Chinese Academy of Sciences
Yu Rong
DAMO Academy, Alibaba Group
Tingyang Xu
Alibaba DAMO Academy
Machine Learning, Deep Graph Learning, Drug Discovery
Zhenyi Zhong
College of Intelligence and Computing, Tianjin University
Zhiyuan Liu
National University of Singapore
Pengju Wang
Chinese Academy of Sciences
Deli Zhao
Alibaba DAMO Academy
Generative Models, Multimodal Learning, Foundation Models
Qiang Liu
NLPR, MAIS, Institute of Automation, Chinese Academy of Sciences
Shu Wu
NLPR, MAIS, Institute of Automation, Chinese Academy of Sciences
Liang Wang
NLPR, MAIS, Institute of Automation, Chinese Academy of Sciences