Proactive Hearing Assistants that Isolate Egocentric Conversations

📅 2025-11-14
🏛️ Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the challenge of automatically identifying and separating conversation partners in multi-speaker scenarios for hearing aids. We propose a prompt-free, on-device, real-time proactive hearing assistance method that uses the wearer's self-speech as an acoustic anchor, models turn-taking behavior, and integrates binaural spatial cues. The approach pairs a lightweight streaming model for ultra-low-latency online processing with a slower, context-aware model that captures long-term conversational dynamics, forming a dual-timescale architecture. Evaluated on 6.8 hours of real-world dyadic and triadic dialogue data from 11 participants, the method significantly improves target speech separation accuracy and speech intelligibility, supports efficient on-device deployment, and generalizes robustly across diverse acoustic environments. To our knowledge, this is the first work to achieve prompt-free conversational speech separation via self-speech anchoring and explicit turn-taking modeling.

📝 Abstract
We introduce proactive hearing assistants that automatically identify and separate the wearer's conversation partners, without requiring explicit prompts. Our system operates on egocentric binaural audio and uses the wearer's self-speech as an anchor, leveraging turn-taking behavior and dialogue dynamics to infer conversational partners and suppress others. To enable real-time, on-device operation, we propose a dual-model architecture: a lightweight streaming model runs every 12.5 ms for low-latency extraction of the conversation partners, while a slower model runs less frequently to capture longer-range conversational dynamics. Results on real-world 2- and 3-speaker conversation test sets, collected with binaural egocentric hardware from 11 participants totaling 6.8 hours, show generalization in identifying and isolating conversational partners in multi-conversation settings. Our work marks a step toward hearing assistants that adapt proactively to conversational dynamics and engagement. More information can be found on our website: https://proactivehearing.cs.washington.edu/
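The dual-model scheduling described above (a fast streaming model invoked on every 12.5 ms audio chunk, plus a slower model refreshed infrequently with longer context) can be sketched as a simple control loop. This is a minimal illustration, not the authors' implementation; the model bodies, the `SLOW_EVERY` cadence, and the gain-based state are all hypothetical placeholders.

```python
SAMPLE_RATE = 16_000
CHUNK_MS = 12.5                                       # streaming hop from the abstract
CHUNK_SAMPLES = int(SAMPLE_RATE * CHUNK_MS / 1000)    # 200 samples per chunk at 16 kHz
SLOW_EVERY = 80                                       # hypothetical: refresh context about once per second

def fast_model(chunk, state):
    """Placeholder for the lightweight low-latency extractor."""
    gain = state["gain"]
    return [s * gain for s in chunk]

def slow_model(history):
    """Placeholder for the slower, context-aware model that would
    analyze turn-taking over the accumulated conversation history."""
    total = sum(abs(s) for chunk in history for s in chunk)
    return {"gain": 1.0 if total > 0 else 0.0}

def process_stream(chunks):
    history, state, out = [], {"gain": 1.0}, []
    for i, chunk in enumerate(chunks):
        out.append(fast_model(chunk, state))          # runs on every 12.5 ms chunk
        history.append(chunk)
        if (i + 1) % SLOW_EVERY == 0:                 # runs much less frequently
            state = slow_model(history)               # update long-range speaker context
    return out
```

The key design point the abstract implies is that only the fast path sits on the latency-critical output; the slow path feeds it state asynchronously.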
Problem

Research questions and friction points this paper is trying to address.

Automatically identify and separate conversation partners without explicit prompts
Suppress non-partner speakers using turn-taking behavior and dialogue dynamics
Enable real-time on-device operation in multi-speaker conversation settings
Innovation

Methods, ideas, or system contributions that make the work stand out.

Proactive hearing assistants isolate egocentric conversations automatically
System uses self-speech anchor and dialogue dynamics for partner inference
Dual-model architecture enables real-time on-device operation
Guilin Hu
Paul G. Allen School of Computer Science & Engineering, University of Washington
Malek Itani
University of Washington
mobile systems, embedded systems, audio & speech, machine learning, small-scale robotics
Tuochao Chen
University of Washington
Speech AI
Shyamnath Gollakota
Paul G. Allen School of Computer Science & Engineering, University of Washington