Towards Agents That Know When They Don't Know: Uncertainty as a Control Signal for Structured Reasoning

📅 2025-09-02
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address low factual consistency and overconfidence in LLM-based agents performing multi-table biomedical reasoning, this paper proposes an uncertainty-aware structured reasoning framework. It innovatively introduces uncertainty as a dynamic control signal during retrieval and summarization: retrieval confidence is assessed via multi-round table-selection entropy, while summarization integrates self-consistency and perplexity to quantify uncertainty—thereby enabling uncertainty-driven inference termination, calibrated confidence expression, and controllable synthetic data generation. The method is integrated into the GRPO reinforcement learning framework, supporting inference-time filtering and uncertainty-aware data augmentation. Evaluated on multi-omics benchmarks, the approach triples the number of correct and useful statements in summaries, improves downstream survival prediction C-index from 0.32 to 0.63, and significantly reduces calibration error—demonstrating that explicit uncertainty modeling concurrently enhances factual reliability and controllability.
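The retrieval-confidence signal described above, entropy over multi-round table selection, can be sketched as follows. This is a minimal illustration, not the paper's implementation; the function name and interface are assumptions:

```python
import math
from collections import Counter

def table_selection_entropy(rollouts):
    """Shannon entropy (in bits) of the tables chosen across multiple
    retrieval rollouts. Agreement across rollouts yields low entropy
    (confident retrieval); disagreement yields high entropy."""
    counts = Counter(rollouts)
    total = len(rollouts)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# All five rollouts pick the same table: entropy is zero (fully confident).
confident = table_selection_entropy(["clinical"] * 5)

# Rollouts split across three tables: entropy is high (uncertain retrieval).
uncertain = table_selection_entropy(["clinical", "mutations", "expression",
                                     "clinical", "mutations"])
```

Thresholding this entropy gives the agent a simple, model-agnostic trigger for abstaining or re-querying when its table choices are unstable.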

📝 Abstract
Large language model (LLM) agents are increasingly deployed in structured biomedical data environments, yet they often produce fluent but overconfident outputs when reasoning over complex multi-table data. We introduce an uncertainty-aware agent for query-conditioned multi-table summarization that leverages two complementary signals: (i) retrieval uncertainty (entropy over multiple table-selection rollouts) and (ii) summary uncertainty (combining self-consistency and perplexity). Summary uncertainty is incorporated into reinforcement learning (RL) with Group Relative Policy Optimization (GRPO), while both retrieval and summary uncertainty guide inference-time filtering and support the construction of higher-quality synthetic datasets. On multi-omics benchmarks, our approach improves factuality and calibration, nearly tripling correct and useful claims per summary (3.0 → 8.4 internal; 3.6 → 9.9 cancer multi-omics) and substantially improving downstream survival prediction (C-index 0.32 → 0.63). These results demonstrate that uncertainty can serve as a control signal, enabling agents to abstain, communicate confidence, and become more reliable tools for complex structured-data environments.
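The summary-uncertainty signal combines self-consistency and perplexity. The sketch below is one plausible way to blend the two; the weighting, the claim-matching scheme, and all names are illustrative assumptions, since the abstract does not give the exact formula:

```python
import math

def perplexity(token_logprobs):
    """Perplexity from per-token natural-log probabilities of the summary."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

def self_consistency(claims_per_rollout):
    """Fraction of claims in the primary summary that are also asserted
    by at least half of the other sampled summaries (exact-match here;
    a real system would use semantic matching)."""
    primary, *others = claims_per_rollout
    if not primary:
        return 0.0
    def supported(claim):
        return sum(claim in other for other in others) >= len(others) / 2
    return sum(supported(c) for c in primary) / len(primary)

def summary_uncertainty(claims_per_rollout, token_logprobs, alpha=0.5):
    """Hypothetical blend: low self-consistency and high perplexity both
    push uncertainty up; result lies roughly in [0, 1]."""
    sc = self_consistency(claims_per_rollout)
    ppl = perplexity(token_logprobs)
    return alpha * (1.0 - sc) + (1.0 - alpha) * (1.0 - 1.0 / ppl)
```

A scalar like this can be plugged directly into a GRPO reward (penalizing high-uncertainty rollouts) or compared against a threshold at inference time.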
Problem

Research questions and friction points this paper is trying to address.

Addressing overconfident outputs in LLM agents
Improving reasoning over multi-table biomedical data
Leveraging uncertainty signals for better calibration
Innovation

Methods, ideas, or system contributions that make the work stand out.

Leveraging retrieval and summary uncertainty signals
Using Group Relative Policy Optimization for reinforcement learning
Employing uncertainty-guided inference-time filtering and dataset construction
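Uncertainty-guided inference-time filtering, the last item above, amounts to keeping only candidate summaries whose uncertainty signals fall below thresholds and abstaining otherwise. A minimal sketch, with thresholds and field names chosen for illustration rather than taken from the paper:

```python
def filter_or_abstain(candidates, max_entropy=1.0, max_uncertainty=0.4):
    """Select the lowest-uncertainty candidate that passes both gates:
    retrieval entropy and summary uncertainty below their thresholds.
    Returns None (abstain) if no candidate qualifies, rather than
    emitting an overconfident summary. Each candidate is a dict with
    'summary', 'retrieval_entropy', and 'summary_uncertainty' keys."""
    kept = [c for c in candidates
            if c["retrieval_entropy"] <= max_entropy
            and c["summary_uncertainty"] <= max_uncertainty]
    if not kept:
        return None  # abstain
    return min(kept, key=lambda c: c["summary_uncertainty"])

candidates = [
    {"summary": "s1", "retrieval_entropy": 0.2, "summary_uncertainty": 0.3},
    {"summary": "s2", "retrieval_entropy": 1.8, "summary_uncertainty": 0.1},
]
best = filter_or_abstain(candidates)  # s2 fails the entropy gate
```

The same gating logic can also screen synthetic training examples, so that uncertainty-aware data augmentation only admits samples the model generated confidently.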