Neural Speech Tracking in a Virtual Acoustic Environment: Audio-Visual Benefit for Unscripted Continuous Speech

📅 2025-01-14
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
How do audiovisual cues—particularly lip movements—enhance speech comprehension and cortical tracking of continuous speech under naturalistic, noisy, and reverberant conditions? Method: We recorded high-density EEG from participants listening to unscripted, ecologically valid speech in a virtual acoustic environment with simulated reverberation. Lip aperture was quantified from video, acoustic features (e.g., fundamental frequency, jitter) were extracted, and speech envelope–EEG correlations were computed to assess neural tracking fidelity. Contribution/Results: This study provides the first demonstration of audiovisual gain using untrained speakers, spontaneous speech, and high-ecological-validity virtual acoustics. Under noise, audiovisual integration significantly improved cortical speech tracking accuracy (p < 0.001); occluding lips abolished this benefit, reducing performance to auditory-only levels—confirming the necessity of visual articulatory cues in degraded listening. Critically, speaker-specific acoustic and visual characteristics emerged as key modulators of multimodal integration, highlighting inter-individual variability in audiovisual speech processing.

Technology Category

Application Category

📝 Abstract
The audio visual benefit in speech perception, where congruent visual input enhances auditory processing, is well documented across age groups, particularly in challenging listening conditions and among individuals with varying hearing abilities. However, most studies rely on highly controlled laboratory environments with scripted stimuli. Here, we examine the audio visual benefit using unscripted, natural speech from untrained speakers within a virtual acoustic environment. Using electroencephalography (EEG) and cortical speech tracking, we assessed neural responses across audio visual, audio only, visual only, and masked lip conditions to isolate the role of lip movements. Additionally, we analysed individual differences in acoustic and visual features of the speakers, including pitch, jitter, and lip openness, to explore their influence on the audio visual speech tracking benefit. Results showed a significant audio visual enhancement in speech tracking with background noise, with the masked lip condition performing similarly to the audio-only condition, emphasizing the importance of lip movements in adverse listening situations. Our findings reveal the feasibility of cortical speech tracking with naturalistic stimuli and underscore the impact of individual speaker characteristics on audio-visual integration in real world listening contexts.
Problem

Research questions and friction points this paper is trying to address.

Audio-Visual Integration
Speech Perception
Neural Processing
Innovation

Methods, ideas, or system contributions that make the work stand out.

EEG technology
audio-visual integration
natural speaking environment
🔎 Similar Papers
No similar papers found.
Mareike Daeglau
Mareike Daeglau
University of Oldenburg
EEG Motor Imagery BCI Neurofeedback
J
Juergen Otten
Department of Medical Physics and Acoustics, Carl von Ossietzky University of Oldenburg, Germany
G
G. Grimm
Department of Medical Physics and Acoustics, Carl von Ossietzky University of Oldenburg, Germany
B
B. Mirkovic
Neuropsychology Lab, Department of Psychology, Carl von Ossietzky University of Oldenburg, Germany
Volker Hohmann
Volker Hohmann
Carl von Ossietzky University of Oldenburg, Germany
Auditory Signal Processing
Stefan Debener
Stefan Debener
Uni Oldenburg
Cognitive Neuroscience