Aurchestra: Fine-Grained, Real-Time Soundscape Control on Resource-Constrained Hearables

📅 2026-02-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes a lightweight, multi-output audio separation network designed for edge deployment, addressing the limitation of existing hearable devices that support only global noise suppression or single-target focus. By integrating 6-millisecond streaming processing, cross-platform model optimization, and a dynamically activated, on-demand interaction interface, the system enables real-time identification and fine-grained volume control of up to five concurrent sound classes on resource-constrained hardware. This approach transforms complex acoustic scenes into programmable multi-track audio streams, letting users remix their auditory environment much as professional audio engineers mix tracks. Evaluated in unseen real-world indoor and outdoor scenarios, the method demonstrates significantly improved target sound enhancement and interference suppression, while maintaining low latency and high robustness.

📝 Abstract
Hearables are becoming ubiquitous, yet their sound controls remain blunt: users can either enable global noise suppression or focus on a single target sound. Real-world acoustic scenes, however, contain many simultaneous sources that users may want to adjust independently. We introduce Aurchestra, the first system to provide fine-grained, real-time soundscape control on resource-constrained hearables. Our system has two key components: (1) a dynamic interface that surfaces only active sound classes and (2) a real-time, on-device multi-output extraction network that generates separate streams for each selected class, achieving robust performance for up to 5 overlapping target sounds, and letting users mix their environment by customizing per-class volumes, much like an audio engineer mixes tracks. We optimize the model architecture for multiple compute-limited platforms and demonstrate real-time performance on 6 ms streaming audio chunks. Across real-world environments in previously unseen indoor and outdoor scenarios, our system enables expressive per-class sound control and achieves substantial improvements in target-class enhancement and interference suppression. Our results show that the world need not be heard as a single, undifferentiated stream: with Aurchestra, the soundscape becomes truly programmable.
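The per-class mixing the abstract describes can be sketched as a weighted sum of separated streams, processed one short chunk at a time. A minimal illustration, assuming a 48 kHz sample rate, a 6 ms chunk, and hypothetical class names and stream buffers (the paper's actual separation network is not reproduced here):

```python
import numpy as np

SAMPLE_RATE = 48_000                 # assumed device sample rate
CHUNK = int(0.006 * SAMPLE_RATE)     # 6 ms streaming chunk -> 288 samples

def remix_chunk(class_streams, gains):
    """Mix separated per-class streams with user-set volumes.

    class_streams: dict of class name -> np.ndarray of CHUNK samples,
                   as produced by a multi-output separation model.
    gains: dict of class name -> linear gain (0.0 mutes the class).
    """
    out = np.zeros(CHUNK, dtype=np.float32)
    for name, stream in class_streams.items():
        out += gains.get(name, 1.0) * stream
    return np.clip(out, -1.0, 1.0)   # keep the mix in playback range

# Toy usage: boost "speech", suppress "traffic" in one 6 ms chunk.
rng = np.random.default_rng(0)
streams = {
    "speech": rng.standard_normal(CHUNK).astype(np.float32) * 0.1,
    "traffic": rng.standard_normal(CHUNK).astype(np.float32) * 0.1,
}
mixed = remix_chunk(streams, {"speech": 1.5, "traffic": 0.2})
```

In a real hearable pipeline this loop would run once per incoming chunk, with the separation model carrying state across chunks to stay causal and low-latency.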
Problem

Research questions and friction points this paper is trying to address.

hearables
soundscape control
fine-grained audio
real-time processing
multi-source sound
Innovation

Methods, ideas, or system contributions that make the work stand out.

fine-grained sound control
real-time audio separation
on-device multi-output network
resource-constrained hearables
programmable soundscape
Seunghyun Oh
Paul G. Allen School of Computer Science and Engineering, University of Washington
Malek Itani
University of Washington
mobile systems, embedded systems, audio & speech, machine learning, small-scale robotics
Aseem Gauri
Paul G. Allen School of Computer Science and Engineering, University of Washington
Shyamnath Gollakota
Paul G. Allen School of Computer Science and Engineering, University of Washington; Hearvana AI