🤖 AI Summary
This work addresses the limited robustness and generalizability of cross-modal brain-signal decoding in multimodal brain–computer interfaces (BCIs). To this end, it provides the first unified analytical review of decoding algorithms for multimodal BCIs. Methodologically, the review organizes heterogeneous neural decoding work spanning visual, speech, and affective modalities around three algorithmic pillars: cross-modality mapping, sequential modeling (e.g., via LSTM or Transformer layers), and multi-modality fusion. Compared to conventional unimodal or shallow-fusion approaches, these AI-driven methods are shown to improve decoding accuracy and cross-subject/cross-device generalization, and to scale to heterogeneous neurophysiological data. The study bridges computational neuroscience and modern AI paradigms, establishing both a methodological foundation and practical guidelines for developing next-generation robust, general-purpose multimodal BCIs.
📝 Abstract
Brain–computer interfaces (BCIs) enable direct communication between the brain and external devices. This review highlights the core decoding algorithms that enable multimodal BCIs, including a dissection of their elements, a unified view of diverse approaches, and a comprehensive analysis of the present state of the field. We emphasize algorithmic advancements in cross-modality mapping and sequential modeling, alongside classic multi-modality fusion, illustrating how these novel AI approaches enhance the decoding of brain data. The current literature on BCI applications in visual, speech, and affective decoding is comprehensively reviewed. Looking forward, we draw attention to the impact of emerging architectures such as multimodal Transformers, and discuss challenges including brain-data heterogeneity and common errors. This review also serves as a bridge in this interdisciplinary field between experts with a neuroscience background and experts who study AI, aiming to provide a comprehensive understanding of AI-powered multimodal BCIs.
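To make the cross-modality mapping idea concrete, below is a minimal pure-Python sketch of scaled dot-product cross-attention, the building block underlying the multimodal Transformers discussed above: query vectors from one modality (here labeled as EEG time steps) attend over key/value vectors from another modality (here labeled as visual features). All variable names, dimensions, and toy inputs are illustrative assumptions, not taken from the reviewed work.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_modal_attention(queries, keys, values):
    """Scaled dot-product attention: each query vector (e.g. an EEG
    time step) attends over key/value vectors from another modality
    (e.g. visual features), yielding one fused vector per query."""
    d = len(keys[0])  # key dimensionality, used for score scaling
    fused = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Convex combination of value vectors under those weights.
        fused.append([sum(w * v[j] for w, v in zip(weights, values))
                      for j in range(len(values[0]))])
    return fused

# Toy example: 2 hypothetical EEG queries, 3 visual key/value pairs.
eeg = [[1.0, 0.0], [0.0, 1.0]]
vis_keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
vis_vals = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
out = cross_modal_attention(eeg, vis_keys, vis_vals)
```

Because the attention weights sum to one, each fused vector is a convex combination of the visual value vectors; the first query, which aligns with the first and third keys, is pulled toward their values. Full architectures stack such layers with learned projections, but the weighting mechanism is the same.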