Generative structural elucidation from mass spectra as an iterative optimization problem

📅 2026-02-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses structural annotation of liquid chromatography–tandem mass spectrometry (LC-MS/MS) data when many spectral features cannot be confidently matched to reference spectra or known structures. The authors pose structure elucidation as an iterative optimization problem and introduce FOAM (Formula-constrained Optimization for Annotating Metabolites). FOAM couples a molecular-formula-constrained graph genetic algorithm with in silico spectral simulation to generate and refine candidate structures for an experimental spectrum. It can run as a standalone elucidation pipeline or complement existing inverse models. The authors evaluate both modes on the NIST'20 and MassSpecGym datasets.

📝 Abstract
Liquid chromatography tandem mass spectrometry (LC-MS/MS) is a critical analytical technique for molecular identification across metabolomics, environmental chemistry, and chemical forensics. A variety of computational methods have emerged for structural annotation of spectral features of interest, but many of these features cannot be confidently annotated with reference structures or spectra. Here, we introduce FOAM (Formula-constrained Optimization for Annotating Metabolites), a computational workflow that poses structure elucidation from LC-MS/MS as an iterative optimization problem. FOAM couples a formula-constrained graph genetic algorithm with spectral simulation to explore candidate annotations given an experimental spectrum. We demonstrate FOAM's performance on the NIST'20 and MassSpecGym datasets as both a standalone elucidation pipeline and as a complement to existing inverse models. This work establishes iterative optimization as an effective and extensible paradigm for structural elucidation.
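The loop the abstract describes (propose candidates that keep the molecular formula fixed, simulate a spectrum for each, score it against the experimental spectrum, keep the best, repeat) can be illustrated with a minimal sketch. This is not FOAM's implementation: candidates here are toy strings of atom symbols rather than molecular graphs, `toy_simulate` is a hypothetical stand-in for a real in silico spectrum simulator, `swap_mutation` is a hypothetical formula-preserving edit, and cosine similarity is assumed as the scoring function.

```python
import math
import random
from collections import Counter
from typing import Callable, Dict, List

Spectrum = Dict[int, float]  # nominal m/z bin -> intensity

# Toy nominal masses; a real workflow would use exact masses and real fragmentation.
TOY_MASS = {"C": 12, "N": 14, "O": 16}


def cosine_similarity(a: Spectrum, b: Spectrum) -> float:
    """Cosine similarity between two binned spectra (0.0 if either is empty)."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def toy_simulate(candidate: str) -> Spectrum:
    """Hypothetical simulator: one unit-intensity peak per prefix 'fragment'."""
    spectrum: Spectrum = {}
    total = 0
    for atom in candidate:
        total += TOY_MASS[atom]
        spectrum[total] = 1.0
    return spectrum


def swap_mutation(candidate: str, rng: random.Random) -> str:
    """Swap two positions; the atom counts (the 'formula') never change."""
    i, j = rng.sample(range(len(candidate)), 2)
    atoms = list(candidate)
    atoms[i], atoms[j] = atoms[j], atoms[i]
    return "".join(atoms)


def optimize(
    seeds: List[str],
    target: Spectrum,
    simulate: Callable[[str], Spectrum],
    mutate: Callable[[str, random.Random], str],
    generations: int = 30,
    population_size: int = 10,
    seed: int = 0,
) -> List[str]:
    """Elitist loop: mutate, score against the target, keep the top candidates."""
    rng = random.Random(seed)

    def score(candidate: str) -> float:
        return cosine_similarity(simulate(candidate), target)

    population = list(dict.fromkeys(seeds))  # de-duplicate, keep order
    for _ in range(generations):
        children = [mutate(rng.choice(population), rng) for _ in range(population_size)]
        pool = list(dict.fromkeys(population + children))
        pool.sort(key=score, reverse=True)
        population = pool[:population_size]
    return population


# Usage: start from one seed with the same toy formula as the hidden target.
target = toy_simulate("CCNOC")
ranked = optimize(["CCCNO"], target, toy_simulate, swap_mutation)
print(ranked[0], cosine_similarity(toy_simulate(ranked[0]), target))
```

Because parents compete with their children in each generation, the best score in the population never decreases. Seeding `optimize` with candidates proposed by another model, rather than a single arbitrary seed, is one way to picture the "complement to existing inverse models" mode mentioned in the abstract.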
Problem

Research questions and friction points this paper is trying to address.

structural elucidation
mass spectrometry
molecular identification
spectral annotation
metabolomics
Innovation

Methods, ideas, or system contributions that make the work stand out.

iterative optimization
graph genetic algorithm
formula-constrained
spectral simulation
structural elucidation
👥 Authors
Mrunali Manjrekar
Computational and Systems Biology, Massachusetts Institute of Technology, Cambridge, MA, USA.
Runzhong Wang
Postdoc, MIT
combinatorial optimization, computational metabolomics, graph matching
Samuel Goldman
Computational and Systems Biology, Massachusetts Institute of Technology, Cambridge, MA, USA.
Jenna C. Fromer
Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA.
Connor W. Coley
Massachusetts Institute of Technology
machine learning, drug discovery, automation, synthetic chemistry