NeuroNarrator: A Generalist EEG-to-Text Foundation Model for Clinical Interpretation via Spectro-Spatial Grounding and Temporal State-Space Reasoning

📅 2026-02-24

🏛️ bioRxiv

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes NeuroNarrator, the first general-purpose EEG-to-text foundation model, which generates clinically meaningful and interpretable narratives from electroencephalography (EEG) signals. Existing methods are often confined to specific tasks or coarse-grained recognition; to address this, the study introduces NeuroCorpus-160K, a harmonized EEG–text paired corpus of over 160,000 segments. The model combines spectro-spatial contrastive learning, state-space-inspired temporal modeling, and conditional generation with a large language model to jointly encode EEG topographic maps and time-series dynamics. Evaluations across multiple benchmarks and zero-shot transfer tasks indicate that the framework captures temporal, spectral, and spatial dynamics and supports interpretable clinical narrative generation.
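To make the contrastive step concrete, here is a minimal sketch of a symmetric InfoNCE loss (the CLIP-style objective) that pulls together two embeddings of the same EEG segment: one from the waveform and one from the topographic map. The source does not state which contrastive loss NeuroNarrator uses, so the function name `symmetric_info_nce`, the temperature value, and the random stand-in embeddings are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_info_nce(wave_emb, topo_emb, temperature=0.07):
    """Hypothetical alignment loss: row i of each input describes the same segment.

    Matching pairs sit on the diagonal of the similarity matrix; every other
    entry in the batch acts as a negative. The temperature is an assumed value.
    """
    w = l2_normalize(wave_emb)
    t = l2_normalize(topo_emb)
    logits = (w @ t.T) / temperature
    idx = np.arange(logits.shape[0])
    loss_wave_to_topo = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_topo_to_wave = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_wave_to_topo + loss_topo_to_wave)


# Random stand-ins for encoder outputs (batch of 8 segments, 16-dim embeddings).
rng = np.random.default_rng(0)
topo = rng.normal(size=(8, 16))
aligned = topo + 0.01 * rng.normal(size=(8, 16))  # waveform view close to its map
shuffled = np.roll(aligned, 1, axis=0)            # deliberately mismatched pairs

loss_aligned = symmetric_info_nce(aligned, topo)
loss_shuffled = symmetric_info_nce(shuffled, topo)
```

In this toy setup the correctly paired batch yields a much lower loss than the mismatched one, which is the signal a contrastive objective of this kind uses to align the two views.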

📝 Abstract

Electroencephalography (EEG) provides a non-invasive window into neural dynamics at high temporal resolution and plays a pivotal role in clinical neuroscience research. Despite this potential, prevailing computational approaches to EEG analysis remain largely confined to task-specific classification objectives or coarse-grained pattern recognition, offering limited support for clinically meaningful interpretation. To address these limitations, we introduce NeuroNarrator, the first generalist EEG-to-text foundation model designed to translate electrophysiological segments into precise clinical narratives. A cornerstone of this framework is the curation of NeuroCorpus-160K, the first harmonized large-scale resource pairing over 160,000 EEG segments with structured, clinically grounded natural-language descriptions. Our architecture first aligns temporal EEG waveforms with spatial topographic maps via a rigorous contrastive objective, establishing spectro–spatially grounded representations. Building on this grounding, we condition a Large Language Model through a state-space–inspired formulation that integrates historical temporal and spectral context to support coherent clinical narrative generation. This approach establishes a principled bridge between continuous signal dynamics and discrete clinical language, enabling interpretable narrative generation that facilitates expert interpretation and supports clinical reporting workflows. Extensive evaluations across diverse benchmarks and zero-shot transfer tasks highlight NeuroNarrator’s capacity to integrate temporal, spectral, and spatial dynamics, positioning it as a foundational framework for time–frequency–aware, open-ended clinical interpretation of electrophysiological data.
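The abstract does not spell out the state-space–inspired formulation, so the following is only a generic illustration of the idea: a discrete linear state-space recurrence, h_t = A h_{t-1} + B x_t with readout c_t = C h_t, that folds a history of segment embeddings into a running context vector. The function `ssm_context`, the matrices `A`, `B`, `C`, and all dimensions are hypothetical; how such a context would be passed to the language model is likewise left open here.

```python
import numpy as np


def ssm_context(segment_embs, A, B, C):
    """Run a linear state-space recurrence over a sequence of segment embeddings.

    h_t = A @ h_{t-1} + B @ x_t   (hidden state accumulates history)
    c_t = C @ h_t                 (context vector emitted at each step)
    This is a textbook linear SSM, not the paper's actual formulation.
    """
    h = np.zeros(A.shape[0])
    contexts = []
    for x in segment_embs:
        h = A @ h + B @ x
        contexts.append(C @ h)
    return np.stack(contexts)


# Illustrative sizes and parameters (assumed, not taken from the paper).
state_dim, emb_dim, ctx_dim = 4, 3, 2
rng = np.random.default_rng(0)
A = 0.9 * np.eye(state_dim)            # decay factor: older segments fade gradually
B = rng.normal(size=(state_dim, emb_dim))
C = rng.normal(size=(ctx_dim, state_dim))

segments = rng.normal(size=(5, emb_dim))  # five consecutive EEG segment embeddings
contexts = ssm_context(segments, A, B, C)  # one context vector per segment
```

Because `A` scales the previous state by 0.9 at every step, each context vector is a weighted summary in which recent segments dominate and earlier ones decay geometrically, which is one simple way to carry historical context forward.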

Problem

Research questions and friction points this paper is trying to address.

EEG-to-text

clinical interpretation

generalist foundation model

spectro-spatial grounding

temporal reasoning

Innovation

Methods, ideas, or system contributions that make the work stand out.

EEG-to-text

foundation model

spectro-spatial grounding