🤖 AI Summary
This study addresses the end-to-end semantic decoding of auditory stimulus texts from fMRI signals. We propose, for the first time, modeling functional brain representations as learnable “brain prompts” that directly condition a pretrained large language model (GPT-2) for text generation, bypassing the conventional two-stage paradigm of feature mapping followed by language modeling. Methodologically, we introduce a cross-modal prompt alignment mechanism that jointly optimizes an fMRI encoder and textual prompts, incorporating a semantic-similarity-guided brain-text alignment loss and fine-tuning GPT-2 for end-to-end generation. Evaluated on a public auditory decoding dataset, our approach achieves improvements of up to 4.61 in METEOR and 2.43 in BERTScore across all subjects, significantly surpassing existing state-of-the-art methods. These results validate the efficacy of brain prompt–driven LLMs for neural semantic decoding.
📝 Abstract
Decoding language information from brain signals is a vital research area within brain-computer interfaces, particularly deciphering semantic information from fMRI signals. Although existing work uses LLMs toward this goal, these methods are not end-to-end and exclude the LLM from the fMRI-to-text mapping, leaving room to explore LLMs in auditory decoding. In this paper, we introduce a novel method, the Brain Prompt GPT (BP-GPT). By using the brain representation extracted from fMRI signals as a prompt, our method can utilize GPT-2 to decode fMRI signals into the stimulus text. Further, we introduce a text prompt and align the fMRI prompt to it. With the text prompt, our BP-GPT can extract a more robust brain prompt and improve decoding with the pre-trained LLM. We evaluate our BP-GPT on an open-source auditory semantic decoding dataset and achieve significant improvements of up to 4.61 in METEOR and 2.43 in BERTScore across all subjects compared to the state-of-the-art method. The experimental results demonstrate that using the brain representation as a prompt to drive an LLM for auditory neural decoding is feasible and effective. The code is available at https://github.com/1994cxy/BP-GPT.
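To make the prompt-conditioning idea concrete, here is a minimal NumPy sketch of the two components the abstract names: an fMRI encoder that maps brain features to a sequence of "brain prompt" embeddings (which would be prepended to GPT-2's input embeddings), and a brain-text prompt alignment loss. All dimensions, the linear encoder, and the per-token cosine loss are illustrative assumptions, not the paper's actual architecture or loss.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not the paper's actual sizes).
n_voxels = 512   # fMRI feature dimension after preprocessing
n_tokens = 8     # number of prompt tokens fed to the LLM
d_model = 768    # GPT-2 hidden size

# A single linear map standing in for the paper's learned fMRI encoder.
W = rng.standard_normal((n_voxels, n_tokens * d_model)) * 0.02

def brain_prompt(fmri_feats):
    """Map an fMRI feature vector to (n_tokens, d_model) prompt embeddings,
    which would be prepended to GPT-2's token embeddings for generation."""
    return (fmri_feats @ W).reshape(n_tokens, d_model)

def alignment_loss(brain_p, text_p):
    """Mean per-token cosine distance between the brain prompt and a text
    prompt — a simplified stand-in for the paper's semantic-similarity-guided
    brain-text alignment loss."""
    bn = brain_p / np.linalg.norm(brain_p, axis=1, keepdims=True)
    tn = text_p / np.linalg.norm(text_p, axis=1, keepdims=True)
    cos = (bn * tn).sum(axis=1)       # cosine similarity per token
    return float((1.0 - cos).mean())  # ~0 when the prompts coincide

fmri = rng.standard_normal(n_voxels)
bp = brain_prompt(fmri)
print(bp.shape)  # (8, 768)
print(alignment_loss(bp, bp))  # ~0: identical prompts align perfectly
```

In training, both the encoder and the loss would be optimized jointly with the LLM's generation objective; at inference, only the brain prompt is needed to condition GPT-2.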