Pretraining Large Brain Language Model for Active BCI: Silent Speech

📅 2025-04-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the limited naturalness and generalizability of silent speech decoding in active brain–computer interfaces (BCIs), this work introduces a new multi-subject silent-speech EEG dataset of over 120 hours and proposes a Large Brain Language Model (LBLM) trained via a novel Future Spectro-Temporal Prediction (FSTP) self-supervised pretraining paradigm. Unlike masked-reconstruction pretraining, FSTP models joint temporal–spectral dependencies of EEG signals autoregressively in the time–frequency domain, and the pretrained model supports cross-session fine-tuning. Experimental results demonstrate state-of-the-art performance on cross-session semantic-level and word-level classification, with accuracies of 47.0% (+5.4%) and 39.6% (+7.3%), respectively, outperforming both fully supervised and existing pretrained baselines. This work establishes a scalable, robust framework for silent-speech BCI.

📝 Abstract
This paper explores silent speech decoding in active brain-computer interface (BCI) systems, which offer more natural and flexible communication than traditional BCI applications. We collected a new silent speech dataset of over 120 hours of electroencephalogram (EEG) recordings from 12 subjects, capturing 24 commonly used English words for language model pretraining and decoding. Following the recent success of pretraining large models with self-supervised paradigms to enhance EEG classification performance, we propose a Large Brain Language Model (LBLM) pretrained to decode silent speech for active BCI. To pretrain LBLM, we propose a Future Spectro-Temporal Prediction (FSTP) pretraining paradigm to learn effective representations from unlabeled EEG data. Unlike existing EEG pretraining methods that mainly follow a masked-reconstruction paradigm, our proposed FSTP method employs autoregressive modeling in the temporal and frequency domains to capture both temporal and spectral dependencies in EEG signals. After pretraining, we fine-tune our LBLM on downstream tasks, including word-level and semantic-level classification. Extensive experiments demonstrate significant performance gains of the LBLM over fully supervised and pretrained baseline models. For instance, in the difficult cross-session setting, our model achieves 47.0% accuracy on semantic-level classification and 39.6% on word-level classification, outperforming baseline methods by 5.4% and 7.3%, respectively. Our research advances silent speech decoding in active BCI systems, offering an innovative solution for EEG language model pretraining and a new dataset for fundamental research.
Problem

Research questions and friction points this paper is trying to address.

Decoding silent speech in active brain-computer interface systems
Pretraining EEG language models for improved classification performance
Developing a new dataset and method for EEG signal representation learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Large Brain Language Model (LBLM) for silent speech decoding
Future Spectro-Temporal Prediction (FSTP) pretraining paradigm
Autoregressive modeling in temporal and frequency domains
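The autoregressive spectro-temporal idea can be illustrated with a minimal data-preparation sketch: split an EEG recording into patches, derive a frequency-domain view of each patch, and form (past context, next patch) pairs whose targets cover both domains. This is a hypothetical illustration assuming a `make_fstp_pairs` helper; the paper's actual tokenization, model architecture, and loss are not reproduced here.

```python
import numpy as np

def make_fstp_pairs(eeg, patch_len=64):
    """Build autoregressive (context, target) pairs in the spirit of FSTP.

    eeg: (n_channels, n_samples) array of a single EEG recording.
    Returns a list of (context, (temporal_target, spectral_target)) pairs,
    where the model would predict the next patch in both domains.
    """
    n_ch, n_samp = eeg.shape
    n_patches = n_samp // patch_len
    # Segment the recording into non-overlapping temporal patches.
    patches = eeg[:, :n_patches * patch_len].reshape(n_ch, n_patches, patch_len)
    # Frequency-domain view: magnitude spectrum of each patch.
    spectral = np.abs(np.fft.rfft(patches, axis=-1))
    pairs = []
    for t in range(1, n_patches):
        context = patches[:, :t, :]           # all past patches
        target = (patches[:, t, :],           # next patch, time domain
                  spectral[:, t, :])          # next patch, frequency domain
        pairs.append((context, target))
    return pairs
```

A pretraining step would then feed `context` through the model and regress (or classify codebook indices for) both targets, which is what lets the representation capture temporal and spectral dependencies jointly.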
Authors: Jinzhao Zhou, Zehong Cao, Yiqun Duan, Connor Barkley, Daniel Leong, Xiaowei Jiang, Quoc-Toan Nguyen, Ziyi Zhao, Thomas Do, Yu-Cheng Chang, Sheng-Fu Liang, Chin-Teng Lin