HCFT: Hierarchical Convolutional Fusion Transformer for EEG Decoding

πŸ“… 2026-01-18
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This work proposes HCFT, a lightweight and general-purpose framework designed to effectively integrate the complex temporal, spectral, and spatial features inherent in multichannel EEG signals. HCFT combines a dual-branch convolutional encoder with a hierarchical Transformer architecture, leveraging cross-attention mechanisms to align local temporal and spatiotemporal features while modeling global dependencies through its hierarchical structure. A Dynamic Tanh normalization strategy is further introduced to enhance training stability. Evaluated on the BCI Competition IV-2b and CHB-MIT datasets, HCFT achieves 80.83% accuracy (κ = 0.6165) on BCI IV-2b, and 99.10% sensitivity, 98.82% specificity, and a false alarm rate of 0.0236 per hour on CHB-MIT. These results outperform more than ten state-of-the-art methods, demonstrating HCFT's superior decoding accuracy, generalization capability, and training efficiency.

πŸ“ Abstract
Electroencephalography (EEG) decoding requires models that can effectively extract and integrate complex temporal, spectral, and spatial features from multichannel signals. To address this challenge, we propose a lightweight and generalizable decoding framework named Hierarchical Convolutional Fusion Transformer (HCFT), which combines dual-branch convolutional encoders and hierarchical Transformer blocks for multi-scale EEG representation learning. Specifically, the model first captures local temporal and spatiotemporal dynamics through time-domain and time-space convolutional branches, and then aligns these features via a cross-attention mechanism that enables interaction between branches at each stage. Subsequently, a hierarchical Transformer fusion structure is employed to encode global dependencies across all feature stages, while a customized Dynamic Tanh normalization module is introduced to replace traditional Layer Normalization in order to enhance training stability and reduce redundancy. Extensive experiments are conducted on two representative benchmark datasets, BCI Competition IV-2b and CHB-MIT, covering both event-related cross-subject classification and continuous seizure prediction tasks. Results show that HCFT achieves 80.83% average accuracy and a Cohen's kappa of 0.6165 on BCI IV-2b, as well as 99.10% sensitivity, 0.0236 false positives per hour, and 98.82% specificity on CHB-MIT, consistently outperforming over ten state-of-the-art baseline methods. Ablation studies confirm that each core component of the proposed framework contributes significantly to the overall decoding performance, demonstrating HCFT's effectiveness in capturing EEG dynamics and its potential for real-world BCI applications.
Problem

Research questions and friction points this paper is trying to address.

EEG decoding
temporal features
spectral features
spatial features
multichannel signals
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical Transformer
Convolutional Fusion
Cross-Attention Mechanism
Dynamic Tanh Normalization
Multi-scale EEG Representation
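The cross-attention mechanism listed above lets one convolutional branch query features from the other at each stage. As a rough sketch of the idea, the snippet below implements single-head scaled dot-product cross-attention with numpy; the absence of learned query/key/value projections and the tensor shapes are simplifying assumptions, not the paper's exact layer.

```python
import numpy as np

def cross_attention(q_feat, kv_feat):
    """Single-head scaled dot-product cross-attention (simplified sketch).

    Queries come from one branch (e.g. the time-domain encoder) and
    keys/values from the other (e.g. the time-space encoder), so each
    query position attends over the other branch's feature sequence.

    q_feat:  (Tq, d) features from the querying branch
    kv_feat: (Tk, d) features from the attended branch
    returns: (Tq, d) fused features aligned to the query branch
    """
    d = q_feat.shape[-1]
    scores = q_feat @ kv_feat.T / np.sqrt(d)       # (Tq, Tk) similarity
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over key axis
    return weights @ kv_feat                       # attention-weighted values
```

In the full model this exchange would run in both directions and at every hierarchical stage, with learned projection matrices producing the queries, keys, and values before the dot products.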
πŸ”Ž Similar Papers