🤖 AI Summary
This work addresses sentence-level language decoding from low signal-to-noise ratio electroencephalography (EEG) data, where prior methods have struggled to surpass random baselines under teacher-forcing–free inference. The authors propose a retrieval-augmented generation (RAG) framework for EEG-to-text translation that integrates a deep EEG encoder, semantic embedding alignment, vector retrieval, and a large language model (LLM). This approach represents the first successful fusion of neural signals with linguistic priors in a teacher-forcing–free setting. Evaluated on the ZuCo dataset across nine participants, the method achieves an average cosine similarity of 0.181 ± 0.022 against 0.139 ± 0.029 for the random baseline, a statistically significant 30.45% relative improvement (p < 0.01), and outperforms existing approaches.
📝 Abstract
Decoding linguistic information from electroencephalography (EEG) signals remains an extremely challenging problem in brain-computer interface (BCI) research. Sentence-level decoding is particularly difficult because of the low signal-to-noise ratio of EEG recordings. Previous studies have typically failed to surpass random-baseline performance unless teacher forcing is used during inference. In this work, we propose a retrieval-augmented generation (RAG)-based pipeline for sentence-level EEG-to-text decoding that combines an EEG encoder aligned with semantic sentence embeddings, a vector retrieval stage, and a large language model (LLM) that refines the retrieved sentences into coherent output. Experiments are conducted on the Zurich Cognitive Language Processing Corpus (ZuCo), which contains single-trial EEG recordings collected during silent reading. To evaluate whether the system extracts meaningful information from the EEG signals, its results are compared with a random baseline. Across nine subjects, the proposed pipeline outperforms the random baseline, achieving a mean cosine similarity of 0.181 ± 0.022 compared with 0.139 ± 0.029 for the baseline, a relative improvement of 30.45%. Statistical analysis confirms that this improvement is significant under a strict evaluation workflow in which inference is performed without access to ground-truth labels.
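The retrieval stage and the baseline comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy three-dimensional vectors, and the choice of a uniformly random corpus sentence as the baseline are all assumptions made for illustration. In the real pipeline the query would be the EEG encoder's output in the sentence-embedding space, and the retrieved sentences would then be passed to an LLM for refinement, a step omitted here.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for zero vectors (assumption)
    return dot / (norm_a * norm_b)


def retrieve_top_k(query, corpus, k=3):
    """Return indices of the k corpus vectors most similar to the query."""
    ranked = sorted(
        range(len(corpus)),
        key=lambda i: cosine_similarity(query, corpus[i]),
        reverse=True,
    )
    return ranked[:k]


def random_baseline(targets, corpus, rng):
    """Mean similarity between each target and a randomly drawn corpus vector.

    Hypothetical baseline: the source does not specify how its random
    baseline is constructed.
    """
    scores = [cosine_similarity(t, rng.choice(corpus)) for t in targets]
    return sum(scores) / len(scores)


def relative_improvement(score, baseline):
    """Relative gain of score over baseline, in percent."""
    return 100.0 * (score - baseline) / baseline


# Toy stand-ins for sentence embeddings and an EEG-derived query vector.
corpus = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
query = [0.9, 0.1, 0.0]
top = retrieve_top_k(query, corpus, k=2)

# Baseline score on the toy data, seeded for reproducibility.
baseline_score = random_baseline([query], corpus, random.Random(0))

# Relative improvement computed from the rounded means in the abstract.
gain = relative_improvement(0.181, 0.139)
```

With the rounded means from the abstract, `relative_improvement(0.181, 0.139)` evaluates to roughly 30.2%. The reported 30.45% may have been computed from unrounded values or aggregated per subject; the abstract does not say which.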