When Should Models Change Their Minds? Contextual Belief Management in Large Language Models

📅 2026-05-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work studies how large language models manage accumulating information in long-horizon interactions, which it formalizes as the Contextual Belief Management (CBM) task: deciding when to update a belief state, when to preserve it, and what to ignore. To make CBM measurable, the authors introduce BeliefTrack, a closed-world benchmark covering Rule Discovery and Circuit Diagnosis, in which a finite belief space and symbolic verifiers allow exact turn-level evaluation of three failure types: Failed Stay, Failed Update, and Failed Isolation. Vanilla models fail severely, and explicit belief-tracking prompts help only a little. Reinforcement learning with belief-state rewards reduces failure rates by 70.9% on average, and representation-level steering reduces them by 46.1% across two tasks.

📝 Abstract

Long-horizon interactions require language models to manage accumulating information: when to update their state, when to preserve their state, and what to ignore. We study this challenge as Contextual Belief Management (CBM): maintaining a predicted belief state aligned with formal evidence while isolating task-irrelevant noise. To make CBM measurable, we introduce BeliefTrack, a closed-world benchmark spanning Rule Discovery and Circuit Diagnosis, where a finite belief space and symbolic verifiers enable exact turn-level evaluation. BeliefTrack diagnoses three failures: Failed Stay, Failed Update, and Failed Isolation. Across multiple LLMs, vanilla models exhibit severe CBM failures, while explicit belief-tracking prompts provide limited gains. In contrast, reinforcement learning with belief-state rewards reduces failure rates by 70.9% on average. Further probing reveals latent belief-state dynamics behind these failures, and representation-level steering reduces failure rates by 46.1% across two tasks. Code is coming soon at https://github.com/zjunlp/CBM.
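
To make the turn-level evaluation described in the abstract more concrete, the following is a minimal Python sketch of how a turn might be labeled with one of the three failure types. It is not the BeliefTrack implementation. The `Turn` record, the `classify_turn` and `failure_rate` functions, and the exact failure definitions are all assumptions inferred from the failure names alone: a belief is modeled as a set of still-possible hypotheses, the gold belief is assumed to come from a symbolic verifier, and each turn is assumed to be flagged as noise or evidence.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

# Assumption: a belief state is the set of hypotheses still considered possible.
Belief = FrozenSet[str]


@dataclass(frozen=True)
class Turn:
    """One interaction turn (hypothetical record, not the benchmark's schema)."""
    prev_gold: Belief  # verifier's belief before this turn
    gold: Belief       # verifier's belief after this turn
    prev_pred: Belief  # model's belief before this turn
    pred: Belief       # model's belief after this turn
    is_noise: bool     # True if the turn carries only task-irrelevant content


def classify_turn(turn: Turn) -> Optional[str]:
    """Return a failure label for the turn, or None if no failure is detected.

    Assumed definitions (inferred from the failure names only):
      - Failed Isolation: the model changes its belief on a noise turn.
      - Failed Stay: the evidence leaves the gold belief unchanged,
        but the model changes its belief anyway.
      - Failed Update: the evidence changes the gold belief,
        but the model does not land on the new gold belief.
    """
    if turn.is_noise:
        return "Failed Isolation" if turn.pred != turn.prev_pred else None
    if turn.gold == turn.prev_gold:
        return "Failed Stay" if turn.pred != turn.prev_pred else None
    return "Failed Update" if turn.pred != turn.gold else None


def failure_rate(turns: List[Turn]) -> float:
    """Fraction of turns that receive any failure label."""
    if not turns:
        return 0.0
    failures = sum(1 for t in turns if classify_turn(t) is not None)
    return failures / len(turns)


# Usage with toy hypotheses "r1", "r2", "r3" (illustrative values only).
all_rules = frozenset({"r1", "r2", "r3"})
narrowed = frozenset({"r1"})

episode = [
    # Evidence rules out r2 and r3, and the model follows: no failure.
    Turn(all_rules, narrowed, all_rules, narrowed, is_noise=False),
    # A noise turn makes the model reopen discarded hypotheses.
    Turn(narrowed, narrowed, narrowed, all_rules, is_noise=True),
]
labels = [classify_turn(t) for t in episode]
rate = failure_rate(episode)
```

Under these assumed definitions, the second turn is the only one labeled, as a Failed Isolation. A benchmark with a finite belief space can compare beliefs by exact set equality, which is what keeps the per-turn check this simple.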

Problem

Research questions and friction points this paper is trying to address.

Contextual Belief Management

belief state

long-horizon interaction

information updating

noise isolation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Contextual Belief Management

BeliefTrack

reinforcement learning