🤖 AI Summary
This study tackles active information acquisition about latent entities (such as a student's knowledge state, diagnostic evidence for a disease, or a user's preferences) in natural language settings, with the goal of reducing epistemic uncertainty through adaptive questioning. The authors propose a forward autoregressive simulation mechanism built on a meta-learned language model, enabling scalable uncertainty quantification and informative query generation over complex semantic spaces. The approach combines meta-learning initialization, uncertainty-driven active querying, large language model fine-tuning, and inference-time augmentation. Evaluated on three tasks, the 20 Questions game, dynamic opinion polling, and adaptive educational assessment, it significantly outperforms baselines: accuracy at identifying key unknowns improves by 12.7%–23.4%, and downstream prediction performance increases as well. The results demonstrate the framework's effectiveness and generality in realistic interactive scenarios.
📝 Abstract
Eliciting information to reduce uncertainty about a latent entity is a critical task in many application domains, e.g., assessing individual student learning outcomes, diagnosing underlying diseases, or learning user preferences. Though natural language is a powerful medium for this purpose, large language models (LLMs) and existing fine-tuning algorithms lack mechanisms for strategically gathering information to refine their own understanding of the latent entity. To harness the generalization power and world knowledge of LLMs in developing effective information-gathering strategies, we propose an adaptive elicitation framework that actively reduces uncertainty on the latent entity. Since probabilistic modeling of an abstract latent entity is difficult, our framework adopts a predictive view of uncertainty, using a meta-learned language model to simulate future observations and enable scalable uncertainty quantification over complex natural language. Through autoregressive forward simulation, our model quantifies how new questions reduce epistemic uncertainty, enabling the development of sophisticated information-gathering strategies to choose the most informative next queries. In experiments on the 20 questions game, dynamic opinion polling, and adaptive student assessment, our method consistently outperforms baselines in identifying critical unknowns and improving downstream predictions, illustrating the promise of strategic information gathering in natural language settings.
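The query-selection idea in the abstract, simulating possible answers to each candidate question and choosing the one that most reduces uncertainty about the latent entity, can be illustrated with a toy sketch. This is not the paper's method: it assumes a small discrete hypothesis space in place of an abstract latent entity, and a lookup table (`ANSWER_MODEL`) in place of the meta-learned language model's forward simulation; all names here are illustrative.

```python
import math
from collections import Counter

# Toy sketch of uncertainty-driven query selection. A discrete hypothesis
# set stands in for the latent entity; a lookup table stands in for the
# meta-learned language model that simulates future observations.

HYPOTHESES = ["cat", "dog", "eagle", "goldfish"]  # candidate latent entities

# ANSWER_MODEL[(hypothesis, question)] -> answer that entity would give
ANSWER_MODEL = {
    ("cat", "Is it a mammal?"): "yes",
    ("dog", "Is it a mammal?"): "yes",
    ("eagle", "Is it a mammal?"): "no",
    ("goldfish", "Is it a mammal?"): "no",
    ("cat", "Does it bark?"): "no",
    ("dog", "Does it bark?"): "yes",
    ("eagle", "Does it bark?"): "no",
    ("goldfish", "Does it bark?"): "no",
}

def entropy(weights):
    """Shannon entropy (bits) of an unnormalized belief over hypotheses."""
    total = sum(weights.values())
    return -sum((w / total) * math.log2(w / total)
                for w in weights.values() if w > 0)

def expected_posterior_entropy(belief, question):
    """Simulate each possible answer, form the posterior it would induce,
    and average the posterior entropies weighted by answer probability."""
    total = sum(belief.values())
    mass_by_answer = Counter()
    for h, w in belief.items():
        mass_by_answer[ANSWER_MODEL[(h, question)]] += w
    exp_h = 0.0
    for ans, mass in mass_by_answer.items():
        posterior = {h: w for h, w in belief.items()
                     if ANSWER_MODEL[(h, question)] == ans}
        exp_h += (mass / total) * entropy(posterior)
    return exp_h

belief = {h: 1.0 for h in HYPOTHESES}  # uniform prior over entities
questions = ["Is it a mammal?", "Does it bark?"]

# The most informative next query minimizes expected posterior entropy,
# i.e. maximizes expected information gain.
best = min(questions, key=lambda q: expected_posterior_entropy(belief, q))
print(best)  # → Is it a mammal?
```

Here "Is it a mammal?" wins because it splits the four hypotheses evenly (expected posterior entropy 1.0 bit), while "Does it bark?" isolates only `dog` (about 1.19 bits). The paper replaces both the hypothesis set and the answer table with autoregressive forward simulation from a fine-tuned language model, which is what makes the idea scale to open-ended natural language.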