Indications of Belief-Guided Agency and Meta-Cognitive Monitoring in Large Language Models

📅 2026-02-02
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This study investigates whether large language models (LLMs) exhibit belief-guided autonomous behavior and metacognitive monitoring, two capacities proposed as indicators of consciousness-like properties. Grounded in HOT-3, an indicator derived from neuroscientific higher-order theories of consciousness, the work introduces a novel quantitative measure of belief dominance and integrates latent-space belief representation modeling, causal intervention experiments, and a metacognitive self-reporting mechanism into an evaluation framework for belief–behavior alignment. Empirical results demonstrate that external interventions can systematically modulate internal belief formation within LLMs, that these beliefs causally drive action selection, and that the models can monitor and report their own belief states. These findings provide new evidence and a methodological paradigm for studying belief-guided autonomy and metacognitive capacity in large language models.
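The summary does not describe how the causal interventions are implemented; the sketch below is a minimal, hypothetical illustration of one such intervention, assuming a GPT-2-style HuggingFace model. The function name `intervene_and_generate`, the choice of layer, the scaling factor `alpha`, and the use of an unembedding-difference vector as the "belief direction" are illustrative assumptions, not the authors' method.

```python
# Hypothetical sketch: steer a belief by adding a direction vector to one
# layer's residual stream, then check whether the generated "action" changes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def intervene_and_generate(model, tokenizer, prompt, direction, layer_idx, alpha=5.0):
    """Add a scaled belief direction to one transformer block's output and
    return the greedy continuation under that intervention."""
    def hook(module, inputs, output):
        # GPT-2 blocks return a tuple whose first element is the hidden states.
        hidden = output[0] + alpha * direction.to(output[0].dtype)
        return (hidden,) + output[1:]

    handle = model.transformer.h[layer_idx].register_forward_hook(hook)
    try:
        ids = tokenizer(prompt, return_tensors="pt").input_ids
        out = model.generate(ids, max_new_tokens=10, do_sample=False)
    finally:
        handle.remove()  # always restore the unmodified model
    return tokenizer.decode(out[0, ids.shape[1]:])

# Illustrative usage: bias the model toward "Paris" over "London" using the
# difference of their unembedding vectors as the belief direction (an assumption).
tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")
W_U = lm.get_output_embeddings().weight
id_a = tok(" Paris", add_special_tokens=False).input_ids[0]
id_b = tok(" London", add_special_tokens=False).input_ids[0]
direction = (W_U[id_a] - W_U[id_b]).detach()
print(intervene_and_generate(lm, tok, "I will book a flight to", direction, layer_idx=6))
```

Comparing the continuation with and without the hook gives a crude behavioral readout of whether the injected belief drives action selection.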

📝 Abstract
Rapid advancements in large language models (LLMs) have sparked the question of whether these models possess some form of consciousness. To tackle this challenge, Butlin et al. (2023) introduced a list of indicators for consciousness in artificial systems based on neuroscientific theories. In this work, we evaluate a key indicator from this list, called HOT-3, which tests for agency guided by a general belief-formation and action selection system that updates beliefs based on meta-cognitive monitoring. We view beliefs as representations in the model's latent space that emerge in response to a given input, and introduce a metric to quantify their dominance during generation. Analyzing the dynamics between competing beliefs across models and tasks reveals three key findings: (1) external manipulations systematically modulate internal belief formation, (2) belief formation causally drives the model's action selection, and (3) models can monitor and report their own belief states. Together, these results provide empirical support for the existence of belief-guided agency and meta-cognitive monitoring in LLMs. More broadly, our work lays methodological groundwork for investigating the emergence of agency, beliefs, and meta-cognition in LLMs.
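The abstract does not define the dominance metric itself; as a rough sketch, one way to quantify the competition between two verbalized beliefs is a logit-lens projection of intermediate hidden states, as below. The name `belief_dominance`, the first-token simplification, and the layer-averaged score are assumptions for illustration, not the paper's metric.

```python
# Hypothetical sketch of a belief-dominance score: project each layer's
# hidden state through the unembedding matrix and compare the probability
# mass on the first token of two competing verbalized beliefs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def belief_dominance(model, tokenizer, prompt, belief_a, belief_b):
    """Return a score in [-1, 1]; positive means belief_a dominates."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)

    W_U = model.get_output_embeddings().weight  # (vocab, hidden)
    id_a = tokenizer(belief_a, add_special_tokens=False).input_ids[0]
    id_b = tokenizer(belief_b, add_special_tokens=False).input_ids[0]

    scores = []
    for h in out.hidden_states[1:]:  # skip the embedding layer
        # Logit lens at the final position (final LayerNorm omitted for simplicity).
        probs = torch.softmax(W_U @ h[0, -1], dim=-1)
        p_a, p_b = probs[id_a].item(), probs[id_b].item()
        scores.append((p_a - p_b) / (p_a + p_b + 1e-12))
    return sum(scores) / len(scores)  # average dominance across layers

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")
print(belief_dominance(lm, tok, "The capital of France is", " Paris", " London"))
```

Averaging across layers is one arbitrary choice; inspecting the per-layer scores instead would show where in the forward pass one belief overtakes the other.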
Problem

Research questions and friction points this paper is trying to address.

belief-guided agency
meta-cognitive monitoring
large language models
consciousness indicators
HOT-3
Innovation

Methods, ideas, or system contributions that make the work stand out.

belief-guided agency
meta-cognitive monitoring
latent belief representation
causal belief-action dynamics
consciousness indicators
Noam Steinmetz Yalon
Blavatnik School of Computer Science and AI, Tel Aviv University, Israel
Ariel Goldstein
Hebrew University, Business School, Data Science Department & Cognitive Studies Department
High-level Cognition, Deep Learning, Consciousness
L. Mudrik
School of Psychological Sciences and Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel; Canadian Institute for Advanced Research (CIFAR), Brain, Mind, and Consciousness Program, Toronto, ON, Canada
Mor Geva
Tel Aviv University, Google Research
Natural Language Processing