SpecBridge: Bridging Mass Spectrometry and Molecular Representations via Cross-Modal Alignment

📅 2026-01-23
🤖 AI Summary
This work addresses small-molecule identification in untargeted mass spectrometry, which is hindered by incomplete spectral libraries. The authors propose an implicit cross-modal alignment framework that recasts structure identification as a geometric alignment problem: the DreaMS mass spectrum encoder is fine-tuned so that its embeddings land in the latent space of a frozen ChemBERTa molecular foundation model, and candidates are then retrieved by cosine similarity against a fixed bank of precomputed molecular embeddings. This avoids both training a cross-modal model from scratch and generating molecular graphs atom-by-atom. On the MassSpecGym, Spectraverse, and MSnLib benchmarks, the method improves Top-1 retrieval accuracy by roughly 20–25% relative to strong neural baselines while keeping the number of trainable parameters small, which suggests that aligning to a frozen foundation model is a practical and stable alternative to designing new architectures.

📝 Abstract
Small-molecule identification from tandem mass spectrometry (MS/MS) remains a bottleneck in untargeted settings where spectral libraries are incomplete. While deep learning offers a solution, current approaches typically fall into two extremes: explicit generative models that construct molecular graphs atom-by-atom, or joint contrastive models that learn cross-modal subspaces from scratch. We introduce SpecBridge, a novel implicit alignment framework that treats structure identification as a geometric alignment problem. SpecBridge fine-tunes a self-supervised spectral encoder (DreaMS) to project directly into the latent space of a frozen molecular foundation model (ChemBERTa), and then performs retrieval by cosine similarity to a fixed bank of precomputed molecular embeddings. Across MassSpecGym, Spectraverse, and MSnLib benchmarks, SpecBridge improves top-1 retrieval accuracy by roughly 20-25% relative to strong neural baselines, while keeping the number of trainable parameters small. These results suggest that aligning to frozen foundation models is a practical, stable alternative to designing new architectures from scratch. The code for SpecBridge is released at https://github.com/HassounLab/SpecBridge.
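The retrieval step described in the abstract (cosine similarity between a spectrum embedding and a fixed bank of precomputed molecular embeddings) can be illustrated with a minimal sketch. This is not the SpecBridge implementation: the function names `l2_normalize` and `retrieve_top_k`, the candidate labels, and the 3-dimensional toy vectors are all assumptions made for illustration. In practice the query would come from the fine-tuned spectral encoder and the bank from the frozen molecular model.

```python
import math
from typing import List, Sequence, Tuple


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def retrieve_top_k(
    query: Sequence[float],
    bank: Sequence[Tuple[str, Sequence[float]]],
    k: int = 1,
) -> List[Tuple[str, float]]:
    """Rank candidates in `bank` by cosine similarity to `query`.

    `bank` is a list of (label, embedding) pairs. Both sides are
    L2-normalized, so the dot product equals the cosine similarity.
    """
    q = l2_normalize(query)
    scored = []
    for label, embedding in bank:
        e = l2_normalize(embedding)
        scored.append((label, sum(a * b for a, b in zip(q, e))))
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy data: made-up 3-dimensional embeddings, not real
# outputs of any spectral or molecular encoder.
candidate_bank = [
    ("candidate_A", [1.0, 0.0, 0.0]),
    ("candidate_B", [0.0, 1.0, 0.0]),
    ("candidate_C", [0.6, 0.8, 0.0]),
]
spectrum_embedding = [0.5, 0.9, 0.1]

top_hits = retrieve_top_k(spectrum_embedding, candidate_bank, k=2)
for label, score in top_hits:
    print(f"{label}: {score:.3f}")
```

Because the molecular embeddings are fixed, a real pipeline would normalize the bank once ahead of time and reuse it for every query instead of re-normalizing inside the loop as this sketch does.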
Problem

Research questions and friction points this paper is trying to address.

small-molecule identification
tandem mass spectrometry
spectral libraries
cross-modal alignment
untargeted metabolomics
Innovation

Methods, ideas, or system contributions that make the work stand out.

cross-modal alignment
foundation model
mass spectrometry
molecular representation
implicit alignment