🤖 AI Summary
Motor imagery (MI) brain–computer interfaces suffer from low decoding accuracy and poor generalizability because EEG signals are non-stationary and have a low signal-to-noise ratio. To address this, the paper proposes an end-to-end framework that combines a discriminative residual dense convolutional autoencoder with a spatio-temporal graph neural network (DRDCAE-STGNN). The autoencoder is trained jointly on signal reconstruction and classification, while the graph module uses a learnable adjacency matrix to model dynamic spatial functional connectivity and a bidirectional LSTM to model temporal evolution. Evaluated on three public datasets (BCI Competition IV 2a, 2b, and PhysioNet), the framework reaches accuracies of 95.42%, 97.51%, and 90.15%, for a mean of 94.36%, which the authors present as evidence of generalizability across subjects and tasks. With a moderate parameter count and an inference time of 0.32 ms per sample, it balances accuracy, real-time efficiency, and physiological interpretability.
📝 Abstract
Motor imagery (MI) based brain-computer interfaces (BCIs) hold significant potential for assistive technologies and neurorehabilitation. However, precise and efficient decoding of MI remains challenging because EEG signals are non-stationary and have a low signal-to-noise ratio. This paper introduces a novel end-to-end deep learning framework, the Discriminative Residual Dense Convolutional Autoencoder with Spatio-Temporal Graph Neural Network (DRDCAE-STGNN), to enhance MI feature learning and classification. Specifically, the DRDCAE module leverages residual-dense connections to learn discriminative latent representations through joint reconstruction and classification, while the STGNN module captures dynamic spatial dependencies via a learnable graph adjacency matrix and models temporal dynamics using a bidirectional long short-term memory (LSTM) network. Extensive evaluations on the BCI Competition IV 2a, BCI Competition IV 2b, and PhysioNet datasets demonstrate state-of-the-art performance, with average accuracies of 95.42%, 97.51%, and 90.15%, respectively. Ablation studies confirm the contribution of each component, and interpretability analysis reveals neurophysiologically meaningful connectivity patterns. Moreover, despite its complexity, the model maintains a feasible parameter count and an inference time of 0.32 ms per sample. These results indicate that the method offers a robust, accurate, and interpretable solution for MI-EEG decoding that generalizes well across subjects and tasks and meets the requirements of potential real-time BCI applications.
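The two ideas the abstract names, a learnable adjacency matrix over EEG channels and a joint reconstruction-plus-classification objective, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the row-softmax normalization of the adjacency, the use of mean-squared error and cross-entropy, and the trade-off weight `lam` are all assumptions made for illustration, and the bidirectional LSTM stage is omitted.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def graph_aggregate(adj_logits, features):
    """Mix per-channel features with a row-normalized adjacency matrix.

    adj_logits: C x C unconstrained scores (the trainable parameters
                in this sketch; the normalization is an assumption).
    features:   C x F per-channel feature vectors.
    """
    adj = [softmax(row) for row in adj_logits]
    n_feat = len(features[0])
    return [
        [sum(adj[i][j] * features[j][f] for j in range(len(features)))
         for f in range(n_feat)]
        for i in range(len(adj))
    ]

def joint_loss(x, x_hat, class_logits, label, lam=0.5):
    """Cross-entropy plus lam times mean-squared reconstruction error.

    lam is a hypothetical trade-off weight, not a value from the paper.
    """
    mse = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    ce = -math.log(softmax(class_logits)[label])
    return ce + lam * mse

# Two channels with equal adjacency scores: each output row is the
# average of the two input feature vectors.
mixed = graph_aggregate([[0.0, 0.0], [0.0, 0.0]], [[1.0, 3.0], [3.0, 5.0]])
```

In a trained model the adjacency scores would be updated by gradient descent together with the other weights, and under these assumptions the learned matrix is what an interpretability analysis would inspect for connectivity patterns.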