SpecX: A Large-Scale Benchmark for Multi-Modal Spectroscopy and Cross-Paradigm Evaluation

📅 2026-05-11
📈 Citations: 0
Influential: 0

🤖 AI Summary
Existing spectral benchmarks are limited in scale, modality alignment, and evaluation scope, making it difficult to uniformly assess both specialized models and multimodal language models. This work proposes SpecX, a large-scale multimodal spectral benchmark that integrates six spectroscopic modalities (NMR, IR, MS, UV, Raman, and fluorescence), covers 1.7 million molecules, and supports tasks such as molecular elucidation, spectrum simulation, and spectral understanding. It establishes a three-tier data framework, consisting of a large-scale pretraining dataset, an aligned multi-spectral subset for benchmarking, and a high-quality experimental subset for evaluation, along with a cross-paradigm evaluation protocol compatible with both specialized and multimodal language models. Experiments reveal that specialized models excel at signal-level modeling, whereas multimodal language models show strengths in high-level reasoning but lack precise spectral grounding, underscoring the need for spectrum-native foundation models.
📝 Abstract
Existing spectral benchmarks are limited in scale, modality alignment, and evaluation scope, and typically focus on either specialized models or multimodal language models (MLLMs). We introduce SpecX, a large-scale benchmark for multi-modal spectroscopy with cross-paradigm evaluation. SpecX contains 1.7M molecules with diverse spectral modalities, including NMR (1H, 13C, HSQC), IR, MS, UV, Raman, and FL, and is organized into three tiers: a large-scale dataset for pretraining, an aligned multi-spectral subset for benchmarking, and a high-quality experimental subset for evaluation. SpecX supports a range of tasks such as molecular elucidation, spectrum simulation, and spectral understanding, and enables unified evaluation across both specialized spectral models and MLLMs. Experiments show that specialized models excel at signal-level modeling, while MLLMs exhibit strengths in high-level reasoning but lack precise spectral grounding. SpecX establishes a unified benchmark for spectral intelligence and highlights the need for spectrum-native foundation models.
Problem

Research questions and friction points this paper is trying to address.

spectral benchmark
multi-modal spectroscopy
cross-paradigm evaluation
molecular elucidation
spectrum-native foundation models
Innovation

Methods, ideas, or system contributions that make the work stand out.

multi-modal spectroscopy
large-scale benchmark
cross-paradigm evaluation
spectrum-native foundation models
molecular elucidation