🤖 AI Summary
How do audiovisual cues—particularly lip movements—enhance speech comprehension and cortical tracking of continuous speech under naturalistic, noisy, and reverberant conditions?
Method: We recorded high-density EEG from participants listening to unscripted, ecologically valid speech in a virtual acoustic environment with simulated reverberation. Lip aperture was quantified from video, acoustic features (e.g., fundamental frequency, jitter) were extracted, and speech envelope–EEG correlations were computed to assess neural tracking fidelity.
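A minimal sketch of the core speech-tracking computation described above, under simplifying assumptions: the broadband amplitude envelope is extracted from the audio, low-pass filtered, resampled to the EEG rate, and correlated with each EEG channel. The function names, the 8 Hz cutoff, and the direct Pearson correlation are illustrative choices, not the authors' exact pipeline.

```python
# Hedged sketch: envelope extraction + envelope-EEG correlation.
# Inputs are hypothetical: `audio` (1-D float array at fs_audio Hz) and
# `eeg` (channels x samples array at fs_eeg Hz).
import numpy as np
from scipy.signal import butter, filtfilt, hilbert, resample_poly

def speech_envelope(audio, fs_audio, fs_eeg, cutoff_hz=8.0):
    """Amplitude envelope: |Hilbert transform|, low-pass, resample to EEG rate."""
    env = np.abs(hilbert(audio))                  # instantaneous amplitude
    b, a = butter(3, cutoff_hz / (fs_audio / 2))  # 3rd-order low-pass (~8 Hz, assumed)
    env = filtfilt(b, a, env)                     # zero-phase filtering
    return resample_poly(env, int(fs_eeg), int(fs_audio))

def tracking_correlation(envelope, eeg):
    """Pearson r between the envelope and each EEG channel (channels x samples)."""
    n = min(len(envelope), eeg.shape[1])
    e = (envelope[:n] - envelope[:n].mean()) / envelope[:n].std()
    x = eeg[:, :n]
    x = (x - x.mean(axis=1, keepdims=True)) / x.std(axis=1, keepdims=True)
    return (x @ e) / n                            # one correlation per channel
```

Comparing these per-channel correlations across the audiovisual, audio-only, and masked-lip conditions would quantify the tracking benefit summarized below.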
Contribution/Results: This study provides the first demonstration of an audiovisual speech-tracking benefit with untrained speakers, spontaneous speech, and ecologically valid virtual acoustics. Under noise, audiovisual integration significantly improved cortical speech tracking accuracy (p < 0.001); occluding the lips abolished this benefit, reducing performance to auditory-only levels and confirming that visual articulatory cues drive the gain in degraded listening. Critically, speaker-specific acoustic and visual characteristics emerged as key modulators of multimodal integration, highlighting inter-individual variability in audiovisual speech processing.
📝 Abstract
The audiovisual benefit in speech perception, where congruent visual input enhances auditory processing, is well documented across age groups, particularly in challenging listening conditions and among individuals with varying hearing abilities. However, most studies rely on highly controlled laboratory environments with scripted stimuli. Here, we examine the audiovisual benefit using unscripted, natural speech from untrained speakers within a virtual acoustic environment. Using electroencephalography (EEG) and cortical speech tracking, we assessed neural responses across audiovisual, audio-only, visual-only, and masked-lip conditions to isolate the role of lip movements. Additionally, we analysed individual differences in acoustic and visual features of the speakers, including pitch, jitter, and lip openness, to explore their influence on the audiovisual speech tracking benefit. Results showed a significant audiovisual enhancement in speech tracking with background noise, with the masked-lip condition performing similarly to the audio-only condition, emphasizing the importance of lip movements in adverse listening situations. Our findings reveal the feasibility of cortical speech tracking with naturalistic stimuli and underscore the impact of individual speaker characteristics on audiovisual integration in real-world listening contexts.
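For the speaker-level acoustic features named above (pitch and jitter), a rough sketch using the praat-parselmouth package is given below; the function name, pitch-range defaults, and jitter settings are assumptions for illustration rather than the paper's settings, and lip openness would require separate video-based measurement.

```python
# Hedged sketch: per-speaker mean F0 and local jitter via praat-parselmouth.
import numpy as np
import parselmouth
from parselmouth.praat import call

def speaker_acoustics(wav_path, f0_min=75.0, f0_max=500.0):
    """Mean fundamental frequency (Hz) and local jitter for one recording."""
    snd = parselmouth.Sound(wav_path)
    pitch = snd.to_pitch(pitch_floor=f0_min, pitch_ceiling=f0_max)
    f0 = pitch.selected_array["frequency"]
    f0 = f0[f0 > 0]                               # keep voiced frames only
    points = call(snd, "To PointProcess (periodic, cc)", f0_min, f0_max)
    jitter = call(points, "Get jitter (local)", 0, 0, 0.0001, 0.02, 1.3)
    return {"mean_f0_hz": float(f0.mean()), "jitter_local": float(jitter)}
```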