NeuroNarrator: A Generalist EEG-to-Text Foundation Model for Clinical Interpretation via Spectro-Spatial Grounding and Temporal State-Space Reasoning

📅 2026-02-24
🏛️ bioRxiv
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes NeuroNarrator, described by its authors as the first general-purpose EEG-to-text foundation model, which generates clinically meaningful, interpretable narratives from electroencephalography (EEG) signals. Existing methods are often confined to specific tasks or coarse-grained recognition; to address this, the study introduces NeuroCorpus-160K, a harmonized EEG–text paired corpus of over 160,000 segments. The model combines spectro-spatial contrastive learning, which aligns EEG waveforms with topographic maps, with state-space-inspired temporal modeling that conditions a large language model for narrative generation. Evaluations across multiple benchmarks and zero-shot transfer tasks indicate that the framework integrates temporal, spectral, and spatial dynamics for clinical narrative generation.
📝 Abstract
Electroencephalography (EEG) provides a non-invasive window into neural dynamics at high temporal resolution and plays a pivotal role in clinical neuroscience research. Despite this potential, prevailing computational approaches to EEG analysis remain largely confined to task-specific classification objectives or coarse-grained pattern recognition, offering limited support for clinically meaningful interpretation. To address these limitations, we introduce NeuroNarrator, the first generalist EEG-to-text foundation model designed to translate electrophysiological segments into precise clinical narratives. A cornerstone of this framework is the curation of NeuroCorpus-160K, the first harmonized large-scale resource pairing over 160,000 EEG segments with structured, clinically grounded natural-language descriptions. Our architecture first aligns temporal EEG waveforms with spatial topographic maps via a rigorous contrastive objective, establishing spectro–spatially grounded representations. Building on this grounding, we condition a Large Language Model through a state-space–inspired formulation that integrates historical temporal and spectral context to support coherent clinical narrative generation. This approach establishes a principled bridge between continuous signal dynamics and discrete clinical language, enabling interpretable narrative generation that facilitates expert interpretation and supports clinical reporting workflows. Extensive evaluations across diverse benchmarks and zero-shot transfer tasks highlight NeuroNarrator’s capacity to integrate temporal, spectral, and spatial dynamics, positioning it as a foundational framework for time–frequency–aware, open-ended clinical interpretation of electrophysiological data.
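The abstract says waveform and topographic-map representations are aligned through a contrastive objective but does not give its exact form. As a rough illustration only, the sketch below assumes a symmetric InfoNCE loss of the kind used in CLIP-style training. The function names, the temperature of 0.07, and the toy embeddings are all hypothetical and are not taken from the paper.

```python
import math
import random


def l2_normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def symmetric_info_nce(wave_emb, topo_emb, temperature=0.07):
    """Assumed symmetric InfoNCE loss over paired embeddings (not the paper's exact loss).

    Row i of wave_emb and row i of topo_emb are treated as a positive pair;
    every other row in the batch acts as a negative.
    """
    w = [l2_normalize(v) for v in wave_emb]
    t = [l2_normalize(v) for v in topo_emb]
    n = len(w)
    # Pairwise cosine similarities scaled by the temperature.
    logits = [
        [sum(a * b for a, b in zip(w[i], t[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    def diag_cross_entropy(rows):
        # Cross-entropy where the correct class for row i is column i.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)  # subtract the max for numerical stability
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    transposed = [list(col) for col in zip(*logits)]
    # Average the waveform-to-map and map-to-waveform directions.
    return 0.5 * (diag_cross_entropy(logits) + diag_cross_entropy(transposed))


# Toy data: 8 hypothetical segment embeddings of dimension 32.
rng = random.Random(0)
wave = [[rng.gauss(0, 1) for _ in range(32)] for _ in range(8)]
topo_aligned = [[x + 0.01 * rng.gauss(0, 1) for x in row] for row in wave]
topo_random = [[rng.gauss(0, 1) for _ in range(32)] for _ in range(8)]

loss_aligned = symmetric_info_nce(wave, topo_aligned)
loss_random = symmetric_info_nce(wave, topo_random)
print(loss_aligned < loss_random)
```

In this toy setting, well-aligned pairs should produce a lower loss than unrelated pairs, which is the behavior a contrastive alignment stage relies on. In a real model the embeddings would come from learned waveform and topographic-map encoders.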
Problem

Research questions and friction points this paper is trying to address.

EEG-to-text
clinical interpretation
generalist foundation model
spectro-spatial grounding
temporal reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

EEG-to-text
foundation model
spectro-spatial grounding
state-space reasoning
clinical narrative generation
Guoan Wang
Stevens Institute of Technology
General Medical AI
Shihao Yang
Assistant Professor, School of Industrial & Systems Engineering, Georgia Institute of Technology
Digital Disease Detection, Electronic Health Records, Markov Chain Monte Carlo, Dynamic System Inference, Financial Engineering
Jun-en Ding
Department of Systems Engineering, Stevens Institute of Technology, 1 Castle Point Terrace, Hoboken, 07030, New Jersey, USA.
Hao Zhu
Department of Systems Engineering, Stevens Institute of Technology, 1 Castle Point Terrace, Hoboken, 07030, New Jersey, USA.
Feng Liu
Stevens Institute of Technology
EEG source imaging, Brain Networks, Dynamic System, Epilepsy, Mental Disorder