🤖 AI Summary
This work addresses the challenge of dynamically aligning response timing with user intent in spoken interactive systems. The authors propose the Tap-to-Adapt framework, which introduces user-initiated light taps as real-time feedback signals to generate response-timing labels online and continuously refine the model. By integrating a dilated temporal convolutional network (Dilated TCN) with a sequence replay strategy, the framework enables end-to-end modeling and evaluation of response timing. Evaluated on approximately 20,000 interaction samples collected from 20 participants, the approach demonstrates significant improvements in both response accuracy and user experience.
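The summary names a dilated temporal convolutional network as the timing model. The paper's exact architecture is not given here, so as a minimal sketch, the two defining properties of a Dilated TCN — causal convolutions (each output depends only on past frames) and an exponentially growing receptive field across layers — can be illustrated in plain Python; all function names and the dilation schedule below are illustrative assumptions, not the authors' configuration:

```python
def dilated_causal_conv(x, weights, dilation):
    """Causal 1-D convolution: out[t] sums weights[k] * x[t - k*dilation],
    so each output depends only on the current and past inputs."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # ignore taps that fall before the sequence start
                acc += w * x[idx]
        out.append(acc)
    return out

def tcn_receptive_field(kernel_size, dilations):
    """Frames of context seen by a stack of dilated causal conv layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

# With kernel size 3 and dilations doubling per layer (a common TCN
# schedule, assumed here), four layers already cover 31 past frames.
print(tcn_receptive_field(3, [1, 2, 4, 8]))  # → 31
```

Doubling the dilation per layer is what lets a shallow stack cover the long audio context needed for timing decisions without a large kernel.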
📝 Abstract
Response timing judgment is a critical component of interactive speech agents. Although there is substantial prior work on turn-taking modeling and voice wake-up, research on response timing judgments continuously aligned with user intent remains scarce. To address this, we propose the Tap-to-Adapt framework, which lets users naturally activate or interrupt the agent via tap interactions, from which online learning labels for the response timing model are constructed. Within this framework, a Dilated TCN and a sequence replay strategy play significant roles, as demonstrated through data-driven experiments and user studies. Additionally, we develop an evaluation and continuous data mining system tailored to the Tap-to-Adapt framework, through which we collected approximately 20,000 samples from user studies involving 20 participants.
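The abstract's core mechanism is turning tap interactions into online training labels for the timing model. The paper's actual labeling rule is not stated here, so the sketch below encodes one plausible reading as an assumption: an "activate" tap means the agent should have responded in the moments just before the tap (positive labels), while an "interrupt" tap means it responded when it should not have (negative labels); the `Tap` type, the 0.5 s window, and the frame rate are all hypothetical:

```python
from dataclasses import dataclass

@dataclass
class Tap:
    time: float  # seconds into the dialogue when the user tapped
    kind: str    # "activate" (agent missed its cue) or "interrupt" (agent spoke wrongly)

def taps_to_labels(num_frames, frame_hz, taps, window=0.5):
    """Convert tap events into per-frame respond/hold labels.

    Hypothetical rule (not from the paper): the `window` seconds before an
    'activate' tap become positive frames (label 1), the span before an
    'interrupt' tap becomes negative frames (label 0); untapped frames stay
    None, i.e. unlabeled, and would be skipped during the online update.
    """
    labels = [None] * num_frames
    for tap in taps:
        lo = max(0, int((tap.time - window) * frame_hz))
        hi = min(num_frames, int(tap.time * frame_hz) + 1)
        value = 1 if tap.kind == "activate" else 0
        for i in range(lo, hi):
            labels[i] = value
    return labels

# E.g. at 10 frames/s, an "activate" tap at t = 2.0 s labels frames 15..20
# (the preceding 0.5 s) as positive; everything else stays unlabeled.
labels = taps_to_labels(50, 10, [Tap(2.0, "activate")])
```

Labeled segments like these are what the sequence replay strategy would revisit during continual updates, mixing past tap-labeled windows with new ones to avoid forgetting.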