Multi-class Decoding of Attended Speaker Direction Using Electroencephalogram and Audio Spatial Spectrum

📅 2024-11-11

🤖 AI Summary
This study addresses fine-grained decoding of the attended speaker's direction (14 classes) from listeners' EEG signals, with the goal of brain-computer interfaces that assist individuals with hearing impairment. Method: Going beyond the binary left/right decoding of earlier work, the authors propose a dual-modal approach that combines EEG features with audio spatial spectra; they report that models relying on EEG alone are significantly less accurate on this task. CNN, LSM-CNN, and Deformer models are evaluated, including the proposed Sp-EEG-Deformer. Contribution/Results: With a 1-second decision window, Sp-EEG-Deformer achieves 14-class accuracies of 55.35% (leave-one-subject-out) and 57.19% (leave-one-trial-out). Decoding accuracy increases as the number of alternative directions decreases. The results support the proposed dual-modal strategy for directional focus decoding.

📝 Abstract
Decoding the directional focus of an attended speaker from listeners' electroencephalogram (EEG) signals is essential for developing brain-computer interfaces to improve the quality of life for individuals with hearing impairment. Previous works have concentrated on binary directional focus decoding, i.e., determining whether the attended speaker is on the left or right side of the listener. However, a more precise decoding of the exact direction of the attended speaker is necessary for effective speech processing. Additionally, audio spatial information has not been effectively leveraged, resulting in suboptimal decoding results. In this paper, we find that, on the recently presented dataset with 14-class directional focus, models relying exclusively on EEG inputs exhibit significantly lower accuracy when decoding the directional focus in both leave-one-subject-out and leave-one-trial-out scenarios. Integrating audio spatial spectra with EEG features effectively improves the decoding accuracy. The CNN, LSM-CNN, and Deformer models are employed to decode the directional focus from listeners' EEG signals and audio spatial spectra. The proposed Sp-EEG-Deformer model achieves notable 14-class decoding accuracies of 55.35% and 57.19% in leave-one-subject-out and leave-one-trial-out scenarios, respectively, with a decision window of 1 second. Experimental results indicate that decoding accuracy increases as the number of alternative directions decreases. These findings suggest the efficacy of our proposed dual-modal directional focus decoding strategy.
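The evaluation setup named in the abstract (1-second decision windows, leave-one-subject-out and leave-one-trial-out) can be illustrated with a minimal sketch. This is not the authors' pipeline: the sampling rate, channel count, trial length, and the helper names `decision_windows` and `leave_one_group_out` are assumptions chosen for illustration, and the data are random.

```python
import numpy as np

def decision_windows(eeg, fs, win_sec=1.0):
    """Split one trial (channels x samples) into non-overlapping decision windows."""
    win = int(round(fs * win_sec))
    n = eeg.shape[1] // win
    # Drop the incomplete tail, then reshape to (windows, channels, samples).
    return eeg[:, :n * win].reshape(eeg.shape[0], n, win).transpose(1, 0, 2)

def leave_one_group_out(group_ids):
    """Yield (held_out, train_idx, test_idx), holding out each group once.

    Passing subject ids gives leave-one-subject-out splits; passing trial ids
    gives a leave-one-trial-out style split.
    """
    group_ids = np.asarray(group_ids)
    for g in np.unique(group_ids):
        test = group_ids == g
        yield g, np.flatnonzero(~test), np.flatnonzero(test)

# Toy data (all values assumed): 3 subjects, one 5.5 s trial each,
# 4 channels sampled at 128 Hz.
fs = 128
rng = np.random.default_rng(0)
windows, subjects = [], []
for subj in range(3):
    trial = rng.standard_normal((4, int(5.5 * fs)))
    w = decision_windows(trial, fs)
    windows.append(w)
    subjects += [subj] * len(w)

X = np.concatenate(windows)          # (total_windows, channels, samples)
folds = list(leave_one_group_out(subjects))
chance = 1 / 14                      # chance level for 14 balanced classes
```

In the subject-wise split, no window from the held-out listener appears in training, so the score reflects generalization to unseen listeners. For reference, chance level with 14 equally likely directions is 1/14, or about 7.1%.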
Problem

Research questions and friction points this paper is trying to address.

EEG-based speech direction recognition
improved accuracy
multiple specific directions
Innovation

Methods, ideas, or system contributions that make the work stand out.

EEG
Spatial Audio Information
Speaker Direction Recognition
Yuanming Zhang
Key Lab of Modern Acoustics, Nanjing University, Nanjing 210093, China; NJU-Horizon Intelligent Audio Lab, Horizon Robotics, Beijing 100094, China
Jing Lu
University of California, Santa Barbara
Electronics, MOCVD material growth
Zhibin Lin
Professor in Marketing, Durham University
Marketing, Consumer Behaviour, Consumer Psychology, Social Media, Tourist Experience
Fei Chen
Department of Electrical and Electronic Engineering, Southern University of Science and Technology, Shenzhen 518055, China
Haoliang Du
Department of Otolaryngology Head and Neck Surgery, Nanjing Drum Tower Hospital, Jiangsu Provincial Key Medical Discipline (Laboratory), Nanjing University, Nanjing 210008, China
Xia Gao
Department of Otolaryngology Head and Neck Surgery, Nanjing Drum Tower Hospital, Jiangsu Provincial Key Medical Discipline (Laboratory), Nanjing University, Nanjing 210008, China