AI Summary
Problem: In aerosol-based matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS), the low signal-to-noise ratio (SNR) of single-shot spectra, together with the reliance on labor-intensive preprocessing and multi-spectrum averaging, hinders real-time pathogen monitoring.
Method: This paper proposes MS-DGFormer, a novel end-to-end framework integrating a singular value decomposition (SVD)-based denoising spectral dictionary encoder with a Transformer architecture. It directly processes raw single-shot MALDI-MS spectra without manual preprocessing or spectral averaging. A dictionary-guided mechanism enhances noise robustness and effectively captures long-range peak correlations across the mass spectrum.
Contribution/Results: MS-DGFormer significantly improves recognition accuracy of unknown biomolecular patterns. Experiments on real aerosol samples demonstrate high-accuracy pathogen identification from a single acquisition, enabling portable, autonomous, real-time field deployment for biological threat detection.
Abstract
Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry (MALDI-MS) is a cornerstone of biomolecular analysis, offering precise identification of pathogens through unique mass spectral signatures. Yet its reliance on labor-intensive sample preparation and multi-shot spectral averaging restricts its use to laboratory settings, rendering it impractical for real-time environmental monitoring. These limitations are especially pronounced in emerging aerosol MALDI-MS systems, where autonomous sampling generates noisy spectra for unknown aerosol analytes and effective analysis requires single-shot detection. To address these challenges, we propose the Mass Spectral Dictionary-Guided Transformer (MS-DGFormer): a data-driven framework that directly processes raw, minimally prepared mass spectral data. MS-DGFormer leverages a transformer architecture designed to capture the long-range dependencies inherent in these time-series spectra. To enhance feature extraction, we introduce a novel dictionary encoder that integrates denoised spectral information derived from Singular Value Decomposition (SVD), enabling the model to discern critical biomolecular patterns from single-shot spectra with robust performance. The resulting system achieves superior pathogen identification from aerosol samples, facilitating autonomous, real-time analysis in field conditions. By eliminating the need for extensive preprocessing, our method unlocks the potential for portable, deployable MALDI-MS platforms, revolutionizing environmental pathogen detection and rapid response to biological threats.
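To make the idea of SVD-derived denoised spectral information concrete, the following is a minimal sketch of truncated-SVD denoising applied to a stack of noisy single-shot spectra. It is not the MS-DGFormer dictionary encoder, whose construction the abstract does not specify. The function names (`svd_denoise`, `make_synthetic_spectra`), the synthetic two-peak spectra, the noise level, and the choice of rank 1 are all assumptions made purely for illustration.

```python
import numpy as np


def svd_denoise(spectra: np.ndarray, rank: int) -> np.ndarray:
    """Return the rank-`rank` truncated-SVD approximation of `spectra`.

    `spectra` has shape (n_spectra, n_bins): one single-shot spectrum per row.
    Keeping only the leading singular components retains structure shared
    across spectra and discards most of the uncorrelated noise.
    """
    u, s, vt = np.linalg.svd(spectra, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


def make_synthetic_spectra(n_spectra=64, n_bins=512, noise_std=0.3, seed=0):
    """Build hypothetical spectra: two Gaussian peaks with random intensity.

    This stands in for real MALDI-MS data, which is not available here.
    """
    rng = np.random.default_rng(seed)
    axis = np.arange(n_bins)
    template = (
        np.exp(-0.5 * ((axis - 150) / 4.0) ** 2)
        + 0.6 * np.exp(-0.5 * ((axis - 380) / 6.0) ** 2)
    )
    # Each shot scales the same peak pattern by a random amplitude.
    amplitudes = rng.uniform(0.5, 1.5, size=(n_spectra, 1))
    clean = amplitudes * template
    noisy = clean + rng.normal(0.0, noise_std, size=clean.shape)
    return clean, noisy


clean, noisy = make_synthetic_spectra()
denoised = svd_denoise(noisy, rank=1)  # rank is an assumed hyperparameter

# Frobenius-norm error relative to the noise-free spectra.
err_noisy = np.linalg.norm(noisy - clean)
err_denoised = np.linalg.norm(denoised - clean)
print(f"error before denoising: {err_noisy:.2f}")
print(f"error after denoising:  {err_denoised:.2f}")
```

In this toy setting the clean spectra share a single peak pattern, so a rank-1 approximation recovers them well. Real spectra from several analytes would presumably need a higher rank, and how the denoised components are passed to the transformer is a design choice of the paper that this sketch does not attempt to reproduce.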