Aurchestra: Fine-Grained, Real-Time Soundscape Control on Resource-Constrained Hearables

📅 2026-02-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes a lightweight, multi-output audio separation network designed for on-device deployment, addressing a limitation of existing hearables: they support only global noise suppression or focus on a single target sound. The system combines streaming inference on 6-millisecond audio chunks, model optimization for multiple compute-limited platforms, and a dynamic interface that surfaces only the sound classes currently active. Together these enable real-time, per-class volume control of up to five overlapping sound classes on resource-constrained hardware. The approach turns a complex acoustic scene into separate per-class streams that users can remix, much as an audio engineer mixes tracks. Evaluated in previously unseen real-world indoor and outdoor scenarios, the method shows substantial improvements in target-sound enhancement and interference suppression while running in real time.
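The idea of an interface that lists only active sound classes can be illustrated with a small sketch. The summary does not say how the system decides that a class is active, so the energy threshold below is purely an assumption, and `chunk_rms`, `active_classes`, `threshold`, and the class names are hypothetical names chosen for illustration.

```python
import math


def chunk_rms(chunk):
    """Root-mean-square level of one chunk of samples (0.0 for an empty chunk)."""
    if not chunk:
        return 0.0
    return math.sqrt(sum(s * s for s in chunk) / len(chunk))


def active_classes(class_chunks, threshold=0.01, max_classes=5):
    """Return class names whose chunk RMS exceeds `threshold`, loudest first.

    Assumption: activity is approximated by a plain energy threshold on
    per-class audio; the actual detection method is not described here.
    The cap of 5 mirrors the "up to five sound classes" in the summary.
    """
    levels = {name: chunk_rms(chunk) for name, chunk in class_chunks.items()}
    loud = [name for name, level in levels.items() if level > threshold]
    loud.sort(key=lambda name: levels[name], reverse=True)
    return loud[:max_classes]


# Hypothetical per-class chunks: speech is loud, birds are quiet, siren is silent.
example = {
    "speech": [0.5, -0.5],
    "siren": [0.0, 0.0],
    "birds": [0.1, -0.1],
}
shown_in_ui = active_classes(example)
```

In this sketch the silent class is simply omitted from the list, so a user would only see controls for sounds that are actually present.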

📝 Abstract

Hearables are becoming ubiquitous, yet their sound controls remain blunt: users can either enable global noise suppression or focus on a single target sound. Real-world acoustic scenes, however, contain many simultaneous sources that users may want to adjust independently. We introduce Aurchestra, the first system to provide fine-grained, real-time soundscape control on resource-constrained hearables. Our system has two key components: (1) a dynamic interface that surfaces only active sound classes and (2) a real-time, on-device multi-output extraction network that generates a separate stream for each selected class. The network achieves robust performance for up to 5 overlapping target sounds and lets users mix their environment by customizing per-class volumes, much like an audio engineer mixes tracks. We optimize the model architecture for multiple compute-limited platforms and demonstrate real-time performance on 6 ms streaming audio chunks. Across real-world environments in previously unseen indoor and outdoor scenarios, our system enables expressive per-class sound control and achieves substantial improvements in target-class enhancement and interference suppression. Our results show that the world need not be heard as a single, undifferentiated stream: with Aurchestra, the soundscape becomes truly programmable.
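The per-class volume mixing described in the abstract can be sketched as a weighted sum of the separated streams, applied one 6 ms chunk at a time. This is a minimal illustration and not the authors' implementation: the 16 kHz sample rate, the function names `samples_per_chunk` and `remix_chunk`, the clipping step, and the class names are all assumptions.

```python
def samples_per_chunk(sample_rate_hz, chunk_ms=6):
    """Number of samples in one streaming chunk of `chunk_ms` milliseconds."""
    return sample_rate_hz * chunk_ms // 1000


def remix_chunk(class_chunks, gains, default_gain=1.0):
    """Mix per-class chunks into one output chunk using per-class gains.

    class_chunks: dict of class name -> list of float samples in [-1, 1]
    gains:        dict of class name -> linear gain (0.0 mutes, 1.0 keeps)
    Classes missing from `gains` use `default_gain` (an assumed policy).
    """
    lengths = {len(chunk) for chunk in class_chunks.values()}
    if len(lengths) > 1:
        raise ValueError("all class chunks must have the same length")
    n = lengths.pop() if lengths else 0

    mixed = [0.0] * n
    for name, chunk in class_chunks.items():
        gain = gains.get(name, default_gain)
        for i, sample in enumerate(chunk):
            mixed[i] += gain * sample

    # Hard-clip to the valid range so boosted classes cannot overflow.
    return [max(-1.0, min(1.0, s)) for s in mixed]


# Assumed 16 kHz sample rate: a 6 ms chunk then holds 96 samples.
chunk_len = samples_per_chunk(16000)

# Hypothetical two-class scene: boost speech, mute traffic.
output = remix_chunk(
    {"speech": [0.2, 0.4], "traffic": [0.5, -0.5]},
    {"speech": 2.0, "traffic": 0.0},
)
```

Because each output chunk depends only on the current per-class chunks and gains, a user's volume change can take effect on the next chunk without buffering additional audio.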

Problem

Research questions and friction points this paper is trying to address.

hearables

soundscape control

fine-grained audio

real-time processing

multi-source sound

Innovation

Methods, ideas, or system contributions that make the work stand out.

fine-grained sound control

real-time audio separation

on-device multi-output network