🤖 AI Summary
To address speaker overload caused by signal superposition when a smart device plays audio and performs ultrasound sensing at the same time, this paper proposes an audio-agnostic cognitive scaling mechanism: sensing signals are dynamically embedded into the residual bandwidth left by the music, without clipping or global amplitude reduction, enabling real-time acoustic sensing while preserving playback quality. The method employs a lightweight deep learning model, supports both sinusoidal and FMCW sensing waveforms, is compatible with arbitrary concurrent audio streams, and is deployable on edge devices. Experiments demonstrate that respiration monitoring and gesture recognition achieve accuracies approaching interference-free baselines, and a user study confirms no perceptible degradation in audio quality, significantly outperforming existing clipping and down-scaling approaches. This work presents the first solution enabling cooperative, adaptive sharing of a speaker's available bandwidth between sensing and playback.
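The contrast between the two baselines and the adaptive idea can be sketched in a few lines of NumPy. This is an illustrative toy, not the paper's method: the function names, sample rate, and the naive per-sample scaler are our own assumptions. Clipping and global down-scaling both touch the music, while a headroom-aware scaler shrinks only the sensing waveform to fit the amplitude budget the music leaves free.

```python
import numpy as np

FS = 48_000  # assumed sample rate (Hz), typical for smart-device speakers

def clip_mix(music: np.ndarray, sensing: np.ndarray) -> np.ndarray:
    """Baseline 1: hard-clip the superposed signal to [-1, 1].
    Cheap, but the clipping distorts both music and sensing signal."""
    return np.clip(music + sensing, -1.0, 1.0)

def downscale_mix(music: np.ndarray, sensing: np.ndarray) -> np.ndarray:
    """Baseline 2: globally attenuate the mix so it never exceeds full scale.
    Avoids clipping, but makes the music quieter and weakens the sensing
    signal, shortening the sensing range."""
    mix = music + sensing
    peak = np.max(np.abs(mix))
    return mix / peak if peak > 1.0 else mix

def headroom_mix(music: np.ndarray, sensing: np.ndarray) -> np.ndarray:
    """Toy stand-in for the adaptive idea: per sample, scale only the
    sensing waveform into the headroom the music leaves, so the music is
    untouched and the sensing amplitude is as large as the budget allows.
    (CoPlay learns this adaptation with a deep model precisely because a
    naive per-sample scale like this one smears the sensing spectrum.)"""
    headroom = 1.0 - np.abs(music)                      # remaining amplitude budget
    scale = np.minimum(1.0, headroom / (np.abs(sensing) + 1e-9))
    return music + scale * sensing
```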
📝 Abstract
Acoustic sensing shows great potential in applications such as health monitoring and gesture interfaces by utilizing the built-in speakers and microphones on smart devices. However, ongoing research and development often overlooks one problem: when the same speaker is used concurrently for sensing and traditional audio tasks (like playing music), the two can interfere with each other, making the system impractical. Strong ultrasonic sensing signals mixed with music overload the speaker's mixer. Current solutions to this overload are clipping or down-scaling, both of which degrade music playback quality, sensing range, and accuracy. To address this challenge, we propose CoPlay, a deep learning-based optimization algorithm that cognitively adapts the sensing signal in real time. It can 1) maximize the sensing signal magnitude within the available bandwidth left by the concurrent music, optimizing sensing range and accuracy, and 2) minimize any consequent frequency distortion that would affect music playback. We design a custom model and test it on common types of sensing signals (sine wave and frequency-modulated continuous wave, FMCW) alongside arbitrary types of concurrent music and speech. First, we micro-benchmark the model to show the quality of the generated signals. Second, we conduct two field studies of downstream acoustic sensing tasks on two devices in the real world. A study with 12 users shows that respiration monitoring and gesture recognition using our adapted signal achieve accuracy similar to no-concurrent-music scenarios, whereas the clipping and down-scaling baselines perform worse. A qualitative study further confirms that CoPlay leaves the music untouched, unlike clipping or down-scaling, which degrade music quality.
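For readers unfamiliar with the two sensing waveforms named above, the following sketch generates a single inaudible tone and a repeated FMCW chirp in the near-ultrasonic band commonly used for acoustic sensing. The frequency band (18-22 kHz), chirp duration, and sample rate are our illustrative assumptions, not values taken from the paper.

```python
import numpy as np

FS = 48_000  # assumed sample rate (Hz)

def sine_sensing(freq_hz: float = 20_000, dur_s: float = 1.0) -> np.ndarray:
    """Continuous single-tone sensing signal: phase shifts in the echo
    encode small motions such as chest displacement during respiration."""
    t = np.arange(int(FS * dur_s)) / FS
    return np.sin(2 * np.pi * freq_hz * t)

def fmcw_sensing(f0: float = 18_000, f1: float = 22_000,
                 chirp_s: float = 0.04, n_chirps: int = 25) -> np.ndarray:
    """Repeated linear chirp (FMCW): the beat frequency between the
    transmitted and reflected chirp encodes range, enabling gesture tracking."""
    t = np.arange(int(FS * chirp_s)) / FS
    k = (f1 - f0) / chirp_s                     # sweep rate (Hz/s)
    chirp = np.sin(2 * np.pi * (f0 * t + 0.5 * k * t ** 2))
    return np.tile(chirp, n_chirps)
```

Either waveform would be fed, together with the concurrent audio stream, into the adaptation stage sketched in the summary above before being played through the shared speaker.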