🤖 AI Summary
This work addresses the absence of turn-level observability in autonomous information-gathering dialogue systems, which hinders real-time monitoring of information-acquisition efficiency and detection of unproductive questioning. The authors propose a Dialogue Telemetry (DT) framework that, after each interaction turn, produces two model-agnostic signals: a Progress Estimator (PE) quantifying residual information potential and a Stalling Index (SI) identifying repetitive, low-yield questioning. DT introduces an interpretable, turn-level stalling-detection mechanism that requires no causal diagnosis, integrating information-theoretic measures (in bits), semantic similarity, and marginal-utility analysis to enable real-time monitoring of and intervention in dialogue efficiency. In simulated search-and-rescue scenarios, DT reliably discriminates efficient from stalled dialogues, and incorporating its signals into reinforcement learning policies significantly improves performance in settings where stalling carries operational costs.
📝 Abstract
Autonomous systems conducting schema-grounded information-gathering dialogues face an instrumentation gap: they lack turn-level observables for monitoring acquisition efficiency and for detecting when questioning becomes unproductive. We introduce Dialogue Telemetry (DT), a measurement framework that produces two model-agnostic signals after each question-answer exchange: (i) a Progress Estimator (PE) quantifying residual information potential per category (with a bits-based variant), and (ii) a Stalling Index (SI) detecting an observable failure signature characterized by repeated category probing with semantically similar, low-marginal-gain responses. SI flags this pattern without requiring causal diagnosis, supporting monitoring in settings where attributing degradation to specific causes may be impractical. We validate DT in controlled, search-and-rescue (SAR)-inspired interviews using large language model (LLM)-based simulations, showing that DT distinguishes efficient from stalled dialogue traces, and we illustrate downstream utility by integrating DT signals into a reinforcement learning (RL) policy. Across these settings, DT provides interpretable turn-level instrumentation that improves policy performance when stalling carries operational costs.
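The abstract names the ingredients of the two signals (bits-based residual potential; repeated category probing, semantic similarity, low marginal gain) but not their exact formulas. Purely as an illustration, a minimal sketch under assumed definitions might look like the following, where one unfilled binary schema slot stands in for one bit of residual potential, and token-overlap (Jaccard) similarity stands in for an embedding-based semantic-similarity model; the field names `category`, `answer`, and `bits_gained` are hypothetical:

```python
def progress_estimator(slots_total: int, slots_filled: int) -> float:
    """Hypothetical bits-based PE: residual information potential,
    modeling each unfilled binary schema slot as one bit of entropy."""
    return float(slots_total - slots_filled)

def jaccard(a: str, b: str) -> float:
    """Token-overlap stand-in for embedding-based semantic similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0

def stalling_index(turns: list[dict], window: int = 3) -> float:
    """Hypothetical SI over the last `window` turns, averaging three cues:
    (i) repeated probing of the same category,
    (ii) semantic similarity of consecutive answers,
    (iii) low marginal information gain (bits gained per turn)."""
    recent = turns[-window:]
    if len(recent) < 2:
        return 0.0
    cats = [t["category"] for t in recent]
    repeat = 1.0 - len(set(cats)) / len(cats)          # (i) category repetition
    sims = [jaccard(recent[i]["answer"], recent[i + 1]["answer"])
            for i in range(len(recent) - 1)]
    sim = sum(sims) / len(sims)                        # (ii) answer similarity
    mean_gain = sum(t["bits_gained"] for t in recent) / len(recent)
    low_gain = 1.0 / (1.0 + mean_gain)                 # (iii) 1.0 when no bits gained
    return (repeat + sim + low_gain) / 3.0

# A stalled SAR-style trace (same category, near-duplicate answers, no gain)
# scores markedly higher than an efficient one covering distinct categories:
stalled = [
    {"category": "location", "answer": "near the north bridge", "bits_gained": 1.0},
    {"category": "location", "answer": "by the north bridge", "bits_gained": 0.0},
    {"category": "location", "answer": "close to the north bridge", "bits_gained": 0.0},
]
efficient = [
    {"category": "location", "answer": "near the north bridge", "bits_gained": 1.0},
    {"category": "injuries", "answer": "two people with leg fractures", "bits_gained": 1.0},
    {"category": "hazards", "answer": "gas leak reported onsite", "bits_gained": 1.0},
]
```

A thresholded SI of this shape could serve as the turn-level trigger for intervention, or as a penalty term in an RL reward when stalling carries operational costs.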