HCFT: Hierarchical Convolutional Fusion Transformer for EEG Decoding

πŸ“… 2026-01-18
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This work proposes HCFT, a lightweight and general-purpose framework designed to effectively integrate the complex temporal, spectral, and spatial features inherent in multichannel EEG signals. HCFT combines a dual-branch convolutional encoder with a hierarchical Transformer architecture, leveraging cross-attention mechanisms to align local temporal and spatiotemporal features while modeling global dependencies through its hierarchical structure. A Dynamic Tanh normalization strategy is further introduced to enhance training stability. Evaluated on the BCI Competition IV-2b and CHB-MIT datasets, HCFT achieves 80.83% accuracy (κ = 0.6165) on BCI IV-2b, and 99.10% sensitivity, 98.82% specificity, and a false alarm rate of 0.0236 per hour on CHB-MIT. These results outperform more than ten state-of-the-art methods, demonstrating HCFT's superior decoding accuracy, generalization capability, and training efficiency.

πŸ“ Abstract
Electroencephalography (EEG) decoding requires models that can effectively extract and integrate complex temporal, spectral, and spatial features from multichannel signals. To address this challenge, we propose a lightweight and generalizable decoding framework named Hierarchical Convolutional Fusion Transformer (HCFT), which combines dual-branch convolutional encoders and hierarchical Transformer blocks for multi-scale EEG representation learning. Specifically, the model first captures local temporal and spatiotemporal dynamics through time-domain and time-space convolutional branches, and then aligns these features via a cross-attention mechanism that enables interaction between branches at each stage. Subsequently, a hierarchical Transformer fusion structure is employed to encode global dependencies across all feature stages, while a customized Dynamic Tanh normalization module is introduced to replace traditional Layer Normalization in order to enhance training stability and reduce redundancy. Extensive experiments are conducted on two representative benchmark datasets, BCI Competition IV-2b and CHB-MIT, covering both event-related cross-subject classification and continuous seizure prediction tasks. Results show that HCFT achieves 80.83% average accuracy and a Cohen's kappa of 0.6165 on BCI IV-2b, as well as 99.10% sensitivity, 0.0236 false positives per hour, and 98.82% specificity on CHB-MIT, consistently outperforming over ten state-of-the-art baseline methods. Ablation studies confirm that each core component of the proposed framework contributes significantly to the overall decoding performance, demonstrating HCFT's effectiveness in capturing EEG dynamics and its potential for real-world BCI applications.
Problem

Research questions and friction points this paper is trying to address.

EEG decoding
temporal features
spectral features
spatial features
multichannel signals
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical Transformer
Convolutional Fusion
Cross-Attention Mechanism
Dynamic Tanh Normalization
Multi-scale EEG Representation
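The cross-attention mechanism listed above lets one convolutional branch query features from the other at each stage. As a rough sketch of the idea, the snippet below implements single-head scaled dot-product cross-attention with numpy; the absence of learned query/key/value projections and the tensor shapes are simplifying assumptions, not the paper's exact layer.

```python
import numpy as np

def cross_attention(q_feat, kv_feat):
    """Single-head scaled dot-product cross-attention (simplified sketch).

    Queries come from one branch (e.g. the time-domain encoder) and
    keys/values from the other (e.g. the time-space encoder), so each
    query position attends over the other branch's feature sequence.

    q_feat:  (Tq, d) features from the querying branch
    kv_feat: (Tk, d) features from the attended branch
    returns: (Tq, d) fused features aligned to the query branch
    """
    d = q_feat.shape[-1]
    scores = q_feat @ kv_feat.T / np.sqrt(d)       # (Tq, Tk) similarity
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over key axis
    return weights @ kv_feat                       # attention-weighted values
```

In the full model this exchange would run in both directions and at every hierarchical stage, with learned projection matrices producing the queries, keys, and values before the dot products.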
πŸ”Ž Similar Papers