🤖 AI Summary
This work addresses EEG-driven auditory attention detection (AAD) in complex acoustic environments. We propose a brain-inspired spiking neural network (SNN) framework built on a spike-driven symmetric dual-branch architecture that models lightweight 1D token sequences with token-channel mixers, augmented by a biologically inspired cross-modal feature fusion strategy for efficient collaborative learning of complementary EEG features. On three public AAD benchmarks, the model matches state-of-the-art decoding accuracy while reducing parameter count by 14.7× and inference energy consumption by 5.8× relative to recent ANN methods. The framework thus balances high decoding accuracy with ultra-low-power deployment, offering a practical, lightweight neural decoding paradigm for neuro-steered intelligent hearing aids.
📝 Abstract
Auditory attention detection (AAD) aims to decode listeners' focus in complex auditory environments from electroencephalography (EEG) recordings, which is crucial for developing neuro-steered hearing devices. Despite recent advancements, EEG-based AAD remains hindered by the absence of synergistic frameworks that can fully leverage complementary EEG features under energy-efficiency constraints. We propose S$^2$M-Former, a novel spiking symmetric mixing framework to address this limitation through two key innovations: i) Presenting a spike-driven symmetric architecture composed of parallel spatial and frequency branches with mirrored modular design, leveraging biologically plausible token-channel mixers to enhance complementary learning across branches; ii) Introducing lightweight 1D token sequences to replace conventional 3D operations, reducing parameters by 14.7$\times$. The brain-inspired spiking architecture further reduces power consumption, achieving a 5.8$\times$ energy reduction compared to recent ANN methods, while also surpassing existing SNN baselines in terms of parameter efficiency and performance. Comprehensive experiments on three AAD benchmarks (KUL, DTU and AV-GC-AAD) across three settings (within-trial, cross-trial and cross-subject) demonstrate that S$^2$M-Former achieves comparable state-of-the-art (SOTA) decoding accuracy, making it a promising low-power, high-performance solution for AAD tasks.
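The two core ideas named in the abstract, flattening EEG features into 1D token sequences and mixing them alternately along the token and channel axes with binary spike activations, can be sketched roughly as follows. This is a minimal illustrative NumPy sketch, not the paper's implementation: the shapes, weights, residual wiring, and simple Heaviside activation (standing in for whatever spiking neuron model S$^2$M-Former actually uses) are all assumptions.

```python
import numpy as np

def spike(x, threshold=1.0):
    # Heaviside spike activation: emit a binary spike wherever the input
    # exceeds the threshold. Illustrative stand-in for a spiking neuron
    # model (e.g. LIF dynamics with surrogate gradients in training).
    return (x >= threshold).astype(x.dtype)

def token_channel_mixer(tokens, w_token, w_channel):
    # tokens: (T, C) 1D token sequence -- T tokens, C feature channels.
    # Token mixing: linear map across the token axis, then spikes,
    # with a residual connection (an assumed design, for stability).
    tokens = tokens + spike(w_token @ tokens)      # (T, T) @ (T, C) -> (T, C)
    # Channel mixing: linear map across the channel axis, then spikes.
    tokens = tokens + spike(tokens @ w_channel)    # (T, C) @ (C, C) -> (T, C)
    return tokens

rng = np.random.default_rng(0)
T, C = 16, 8                                # illustrative token/channel counts
eeg_tokens = rng.standard_normal((T, C))    # stand-in for one branch's tokens
w_token = rng.standard_normal((T, T)) * 0.1
w_channel = rng.standard_normal((C, C)) * 0.1

out = token_channel_mixer(eeg_tokens, w_token, w_channel)
print(out.shape)  # (16, 8)
```

Because every activation is a 0/1 spike, the mixing stages reduce to sparse accumulate operations on neuromorphic hardware, which is the source of the energy savings the abstract reports; operating on (T, C) token matrices rather than 3D spatial-spectral tensors is what keeps the parameter count low.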