S$^2$M-Former: Spiking Symmetric Mixing Branchformer for Brain Auditory Attention Detection

📅 2025-08-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses EEG-driven auditory attention detection (AAD) in complex acoustic environments. The authors propose a brain-inspired spiking neural network (SNN) framework built on a spike-driven symmetric dual-branch architecture that operates on lightweight 1D token sequences and applies token-channel mixers, augmented by a biologically inspired cross-branch feature-fusion strategy for efficient collaborative learning of complementary EEG features. Compared with state-of-the-art methods, the model achieves new SOTA performance on three public AAD benchmarks while using 14.7× fewer parameters and 5.8× less inference energy. The framework thus strikes a practical balance between high decoding accuracy and ultra-low-power deployment, offering a lightweight neural-decoding paradigm for neuro-steered intelligent hearing aids.

📝 Abstract
Auditory attention detection (AAD) aims to decode listeners' focus in complex auditory environments from electroencephalography (EEG) recordings, which is crucial for developing neuro-steered hearing devices. Despite recent advancements, EEG-based AAD remains hindered by the absence of synergistic frameworks that can fully leverage complementary EEG features under energy-efficiency constraints. We propose S$^2$M-Former, a novel spiking symmetric mixing framework to address this limitation through two key innovations: i) Presenting a spike-driven symmetric architecture composed of parallel spatial and frequency branches with mirrored modular design, leveraging biologically plausible token-channel mixers to enhance complementary learning across branches; ii) Introducing lightweight 1D token sequences to replace conventional 3D operations, reducing parameters by 14.7$\times$. The brain-inspired spiking architecture further reduces power consumption, achieving a 5.8$\times$ energy reduction compared to recent ANN methods, while also surpassing existing SNN baselines in terms of parameter efficiency and performance. Comprehensive experiments on three AAD benchmarks (KUL, DTU and AV-GC-AAD) across three settings (within-trial, cross-trial and cross-subject) demonstrate that S$^2$M-Former achieves comparable state-of-the-art (SOTA) decoding accuracy, making it a promising low-power, high-performance solution for AAD tasks.
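The claimed 5.8× energy reduction follows the accounting convention common in SNN papers: an ANN layer spends a multiply-accumulate (MAC) per connection, while a spiking layer spends only an accumulate (AC), and only when the presynaptic neuron fires. The sketch below uses the widely cited 45 nm CMOS figures (≈4.6 pJ per 32-bit MAC, ≈0.9 pJ per AC, after Horowitz); whether S$^2$M-Former uses exactly these constants, firing rates, or timesteps is an assumption — the sketch shows the structure of the estimate, not the paper's numbers.

```python
# Standard SNN vs. ANN energy accounting (45 nm CMOS figures, Horowitz 2014).
E_MAC = 4.6  # pJ per 32-bit floating-point multiply-accumulate
E_AC = 0.9   # pJ per 32-bit accumulate

def ann_energy(macs):
    """Every connection in an ANN costs one MAC per forward pass."""
    return macs * E_MAC

def snn_energy(synops, firing_rate, timesteps):
    """A spiking connection costs one AC, gated by spikes over T timesteps."""
    return synops * firing_rate * timesteps * E_AC

# Hypothetical workload: 1e9 synaptic operations, 20% firing rate, 4 timesteps.
ann = ann_energy(1e9)
snn = snn_energy(1e9, firing_rate=0.2, timesteps=4)
print(ann / snn)  # ≈ 6.4× in this illustrative setting
```

Sparse firing is what makes the ratio favorable: at higher firing rates or more timesteps, the SNN advantage shrinks.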
Problem

Research questions and friction points this paper is trying to address.

Decoding auditory attention from EEG recordings efficiently
Leveraging complementary EEG features under energy constraints
Reducing power consumption while maintaining high performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Spike-driven symmetric architecture with parallel branches
Lightweight 1D token sequences reduce parameters
Brain-inspired spiking cuts power consumption significantly
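The token-channel mixer named above can be pictured as an MLP-Mixer-style block acting on 1D token sequences, with a binary spike nonlinearity in place of the usual activation. The minimal numpy sketch below is illustrative only: the layer shapes, random weights, residual placement, and plain Heaviside firing are assumptions, not the paper's implementation (which would use trained weights and surrogate gradients).

```python
import numpy as np

rng = np.random.default_rng(0)

def spike(x, threshold=1.0):
    """Heaviside firing: emit a binary spike where the potential crosses threshold."""
    return (x >= threshold).astype(x.dtype)

class TokenChannelMixer:
    """One mixer block: mix across tokens, then across channels, spiking after each."""
    def __init__(self, n_tokens, n_channels):
        # Random weights stand in for trained parameters.
        self.W_token = rng.standard_normal((n_tokens, n_tokens)) / np.sqrt(n_tokens)
        self.W_chan = rng.standard_normal((n_channels, n_channels)) / np.sqrt(n_channels)

    def __call__(self, x):  # x: (n_tokens, n_channels)
        # Token mixing: each channel's token sequence is linearly mixed, then spiked.
        x = x + spike(self.W_token @ x)   # residual keeps the analog membrane signal
        # Channel mixing: each token's feature vector is linearly mixed, then spiked.
        x = x + spike(x @ self.W_chan)
        return x

tokens = rng.standard_normal((16, 8))     # e.g. 16 EEG-derived 1D tokens, 8 features
out = TokenChannelMixer(16, 8)(tokens)
print(out.shape)  # (16, 8)
```

Because the nonlinearity emits only 0/1 events, downstream matrix products reduce to accumulates over active rows, which is the source of the energy savings claimed above.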
👥 Authors
Jiaqi Wang — Harbin Institute of Technology, Shenzhen
Zhengyu Ma — Pengcheng Laboratory (Neuroscience; Neural Network Dynamics; Computational Physics)
Xiongri Shen — Harbin Institute of Technology, Shenzhen
Chenlin Zhou — Peking University & Pengcheng Laboratory (Efficient Artificial Intelligence; Brain-inspired Computing)
Leilei Zhao — Harbin Institute of Technology, Shenzhen
Han Zhang — Harbin Institute of Technology
Yi Zhong — Harbin Institute of Technology, Shenzhen; Great Bay University
Siqi Cai — Harbin Institute of Technology, Shenzhen
Zhenxi Song — Unknown affiliation (AI for Neuroscience; Brain-Computer Interface; EEG/MRI Analysis)
Zhiguo Zhang — Harbin Institute of Technology, Shenzhen