🤖 AI Summary
This work addresses the absence of turn-level observability in autonomous information-gathering dialogue systems, which hinders real-time monitoring of information-acquisition efficiency and detection of unproductive questioning. The authors propose a Dialogue Telemetry (DT) framework that, after each interaction turn, produces two model-agnostic signals: a Progress Estimator (PE) quantifying residual information potential and a Stalling Index (SI) identifying repetitive, low-yield questioning. DT introduces an interpretable, turn-level stalling-detection mechanism that requires no causal diagnosis, integrating information-theoretic measures (in bits), semantic similarity, and marginal-utility analysis to enable real-time monitoring of and intervention in dialogue efficiency. In simulated search-and-rescue scenarios, DT reliably discriminates efficient from stalled dialogues, and incorporating its signals into reinforcement learning policies significantly improves performance in settings where stalling carries operational costs.
📝 Abstract
Autonomous systems conducting schema-grounded information-gathering dialogues face an instrumentation gap: they lack turn-level observables for monitoring acquisition efficiency and for detecting when questioning becomes unproductive. We introduce Dialogue Telemetry (DT), a measurement framework that produces two model-agnostic signals after each question-answer exchange: (i) a Progress Estimator (PE) quantifying residual information potential per category (with a bits-based variant), and (ii) a Stalling Index (SI) detecting an observable failure signature characterized by repeated category probing with semantically similar, low-marginal-gain responses. SI flags this pattern without requiring causal diagnosis, supporting monitoring in settings where attributing degradation to specific causes may be impractical. We validate DT in controlled, search-and-rescue (SAR)-inspired interviews using large language model (LLM)-based simulations, showing that DT distinguishes efficient from stalled dialogue traces, and we illustrate downstream utility by integrating DT signals into a reinforcement learning (RL) policy. Across these settings, DT provides interpretable turn-level instrumentation that improves policy performance when stalling carries operational costs.
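The abstract names the ingredients of the two signals (bits-based residual potential; repeated category probing, semantic similarity, low marginal gain) but not their exact formulas. Purely as an illustration, a minimal sketch under assumed definitions might look like the following, where one unfilled binary schema slot stands in for one bit of residual potential, and token-overlap (Jaccard) similarity stands in for an embedding-based semantic-similarity model; the field names `category`, `answer`, and `bits_gained` are hypothetical:

```python
def progress_estimator(slots_total: int, slots_filled: int) -> float:
    """Hypothetical bits-based PE: residual information potential,
    modeling each unfilled binary schema slot as one bit of entropy."""
    return float(slots_total - slots_filled)

def jaccard(a: str, b: str) -> float:
    """Token-overlap stand-in for embedding-based semantic similarity."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0

def stalling_index(turns: list[dict], window: int = 3) -> float:
    """Hypothetical SI over the last `window` turns, averaging three cues:
    (i) repeated probing of the same category,
    (ii) semantic similarity of consecutive answers,
    (iii) low marginal information gain (bits gained per turn)."""
    recent = turns[-window:]
    if len(recent) < 2:
        return 0.0
    cats = [t["category"] for t in recent]
    repeat = 1.0 - len(set(cats)) / len(cats)          # (i) category repetition
    sims = [jaccard(recent[i]["answer"], recent[i + 1]["answer"])
            for i in range(len(recent) - 1)]
    sim = sum(sims) / len(sims)                        # (ii) answer similarity
    mean_gain = sum(t["bits_gained"] for t in recent) / len(recent)
    low_gain = 1.0 / (1.0 + mean_gain)                 # (iii) 1.0 when no bits gained
    return (repeat + sim + low_gain) / 3.0

# A stalled SAR-style trace (same category, near-duplicate answers, no gain)
# scores markedly higher than an efficient one covering distinct categories:
stalled = [
    {"category": "location", "answer": "near the north bridge", "bits_gained": 1.0},
    {"category": "location", "answer": "by the north bridge", "bits_gained": 0.0},
    {"category": "location", "answer": "close to the north bridge", "bits_gained": 0.0},
]
efficient = [
    {"category": "location", "answer": "near the north bridge", "bits_gained": 1.0},
    {"category": "injuries", "answer": "two people with leg fractures", "bits_gained": 1.0},
    {"category": "hazards", "answer": "gas leak reported onsite", "bits_gained": 1.0},
]
```

A thresholded SI of this shape could serve as the turn-level trigger for intervention, or as a penalty term in an RL reward when stalling carries operational costs.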