NMIRacle: Multi-modal Generative Molecular Elucidation from IR and NMR Spectra

📅 2025-12-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses end-to-end generation of molecular structures from multimodal infrared (IR) and dual-nucleus (¹H/¹³C) NMR spectra, a long-standing challenge in analytical chemistry. It proposes a two-stage conditional generative framework. In the first stage, a generator learns to reconstruct molecular structures from count-aware fragment encodings, which record both which fragments are present and how many times each occurs. In the second stage, a joint spectral encoder maps the IR and both NMR modalities into a unified conditional embedding that guides the pretrained generator. Its two contributions are count-aware fragment representations and a joint multimodal spectral conditioning mechanism, which together enable fully automated structure elucidation without expert-defined rules or candidate structure libraries. The architecture combines deep generative modeling, conditional variational autoencoders, and a pretrained generator. The method is reported to achieve a 12.6% absolute improvement in top-1 accuracy on standard elucidation benchmarks and to remain robust on highly complex molecules.
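The top-1 accuracy mentioned in the summary can be illustrated with a minimal sketch. It assumes each prediction is a ranked list of candidate structure strings and that predictions and ground truth have already been converted to the same canonical form, so plain string equality stands in for structure identity. The function name `top_k_accuracy` and the toy SMILES strings are illustrative assumptions, not taken from the paper.

```python
def top_k_accuracy(predictions, targets, k=1):
    """Fraction of targets that appear among the first k ranked candidates.

    Assumption: all strings are already canonicalized, so exact string
    equality is a valid test for "same molecule".
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    hits = sum(
        1 for candidates, target in zip(predictions, targets)
        if target in candidates[:k]
    )
    return hits / len(targets)


# Hypothetical ranked candidates for three spectra (toy data, not paper results).
ranked = [
    ["CCO", "COC"],
    ["CC(=O)O", "OCC=O"],
    ["c1ccccc1", "C1CCCCC1"],
]
truth = ["CCO", "OCC=O", "C1CCCCC1"]

top1 = top_k_accuracy(ranked, truth, k=1)  # only the first spectrum is a top-1 hit
top2 = top_k_accuracy(ranked, truth, k=2)  # all three targets appear in the top 2
```

An absolute improvement is the difference between two such accuracies in percentage points, as opposed to a relative (ratio-based) gain.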

📝 Abstract
Molecular structure elucidation from spectroscopic data is a long-standing challenge in chemistry, traditionally requiring expert interpretation. We introduce NMIRacle, a two-stage generative framework that builds upon recent paradigms in AI-driven spectroscopy with minimal assumptions. In the first stage, NMIRacle learns to reconstruct molecular structures from count-aware fragment encodings, which capture both fragment identities and their occurrences. In the second stage, a spectral encoder maps input spectroscopic measurements (IR, ¹H-NMR, ¹³C-NMR) into a latent embedding that conditions the pretrained generator. This formulation bridges fragment-level chemical modeling with spectral evidence, yielding accurate molecular predictions. Empirical results show that NMIRacle outperforms existing baselines on molecular elucidation while maintaining robust performance across increasing levels of molecular complexity.

Problem

Research questions and friction points this paper is trying to address.

Automates molecular structure elucidation from spectroscopic data
Generates molecular structures using IR and NMR spectral inputs
Improves accuracy over existing methods for complex molecules

Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage generative framework for molecular structure elucidation
Fragment encodings capture identities and occurrences for reconstruction
Spectral encoder conditions generator with IR and NMR embeddings
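
To make the idea of count-aware fragment encodings concrete, the sketch below contrasts a presence-only encoding with one that also records occurrences. The fragment vocabulary, the fragment strings, and the function names are hypothetical placeholders. The source does not specify how NMIRacle defines or extracts fragments, so this shows only the general distinction, not the authors' implementation.

```python
from collections import Counter

# Hypothetical fragment vocabulary (an assumption for illustration only).
VOCAB = ["C=O", "O-H", "c1ccccc1", "C-N"]


def presence_encoding(fragments, vocab=VOCAB):
    """Binary vector: 1 if the fragment occurs at least once, else 0."""
    present = set(fragments)
    return [1 if frag in present else 0 for frag in vocab]


def count_encoding(fragments, vocab=VOCAB):
    """Count-aware vector: number of occurrences of each vocabulary fragment."""
    counts = Counter(fragments)  # missing fragments default to a count of 0
    return [counts[frag] for frag in vocab]


# Two toy molecules with the same fragment types but different multiplicities.
mol_a = ["C=O", "O-H", "c1ccccc1"]
mol_b = ["C=O", "C=O", "O-H", "c1ccccc1"]

presence_a, presence_b = presence_encoding(mol_a), presence_encoding(mol_b)
count_a, count_b = count_encoding(mol_a), count_encoding(mol_b)
```

Here the presence vectors for the two toy molecules are identical, while the count vectors differ in the first position. This is the kind of ambiguity that recording occurrences removes.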