🤖 AI Summary
Existing molecular foundation models predominantly rely on SMILES representations and neglect experimentally derived spectra (NMR, IR, MS) and 3D structural information, which limits their performance in stereochemical analysis, conformational prediction, and experimental validation. To address this, we propose the first multimodal foundation model integrating experimental NMR, IR, and MS spectra with molecular 3D conformations. Built upon the Qwen2.5-7B architecture, it employs multi-task learning to unify SMILES, spectral, and spatial representations. Crucially, it supports end-to-end generation from spectra to SMILES to 3D conformations, bridging spectral interpretation, structural elucidation, and de novo design. On spectral classification, it achieves a mean accuracy of 0.53 across NMR, IR, and MS benchmarks; on Spectra-to-SMILES generation, it reaches 15.5% sequence accuracy and 41.7% token accuracy; and its 3D structure generation significantly outperforms general-purpose LLMs, enhancing practical utility in drug discovery and related domains.
📝 Abstract
Recent advances in molecular foundation models have shown impressive performance in molecular property prediction and de novo molecular design, with promising applications in areas such as drug discovery and reaction prediction. Nevertheless, most existing approaches rely exclusively on SMILES representations and overlook both experimental spectra and 3D structural information, two indispensable sources for capturing molecular behavior in real-world scenarios. This limitation reduces their effectiveness in tasks where stereochemistry, spatial conformation, and experimental validation are critical. To overcome these challenges, we propose MolSpectLLM, a molecular foundation model pretrained on Qwen2.5-7B that unifies experimental spectroscopy with molecular 3D structure. By explicitly modeling molecular spectra, MolSpectLLM achieves state-of-the-art performance on spectrum-related tasks, with an average accuracy of 0.53 across NMR, IR, and MS benchmarks. It also performs strongly on the spectra analysis task, obtaining 15.5% sequence accuracy and 41.7% token accuracy on Spectra-to-SMILES and substantially outperforming large general-purpose LLMs. More importantly, beyond molecular elucidation, MolSpectLLM generates accurate 3D molecular structures directly from SMILES or spectral inputs, bridging spectral analysis, molecular elucidation, and molecular design.
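The abstract reports both sequence accuracy and token accuracy for Spectra-to-SMILES but does not define them here. The sketch below shows one plausible reading, and it is an assumption rather than the authors' evaluation code: sequence accuracy as the fraction of predictions that exactly match the reference SMILES string, and token accuracy as the fraction of reference tokens reproduced at the same position. The tokenizer regex and the function names `tokenize`, `sequence_accuracy`, and `token_accuracy` are illustrative choices, not taken from the paper.

```python
import re

# Illustrative SMILES tokenizer (an assumption, not the paper's tokenizer):
# bracket atoms, two-letter halogens, two-digit ring closures, single atoms,
# ring-closure digits, and bond/branch symbols.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|%\d{2}|[A-Za-z]|\d|[=#\-\+\(\)/\\\.@:~\*\$]"
)


def tokenize(smiles):
    """Split a SMILES string into a list of tokens."""
    return SMILES_TOKEN.findall(smiles)


def sequence_accuracy(preds, refs):
    """Fraction of predictions that match the reference string exactly."""
    return sum(p == r for p, r in zip(preds, refs)) / len(refs)


def token_accuracy(preds, refs):
    """Fraction of reference tokens matched at the same position."""
    correct = 0
    total = 0
    for pred, ref in zip(preds, refs):
        pred_tokens = tokenize(pred)
        ref_tokens = tokenize(ref)
        total += len(ref_tokens)
        # zip stops at the shorter sequence, so missing tokens count as wrong.
        correct += sum(a == b for a, b in zip(pred_tokens, ref_tokens))
    return correct / total


if __name__ == "__main__":
    refs = ["CCO", "c1ccccc1"]
    preds = ["CCO", "c1ccccn1"]  # second prediction has one wrong atom
    print(sequence_accuracy(preds, refs))  # 1 of 2 strings match exactly
    print(token_accuracy(preds, refs))     # 10 of 11 tokens match
```

Under this reading, a prediction with a single wrong atom scores zero for sequence accuracy but still earns most of its token credit, which is consistent with the gap between the reported 15.5% and 41.7%. Exact string comparison also penalizes chemically equivalent but differently written SMILES, so a real evaluation would likely canonicalize both strings first.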