Toward IIT-Inspired Consciousness in LLMs: A Reward-Based Learning Framework

📅 2026-01-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work proposes a consciousness-inspired guidance mechanism grounded in Integrated Information Theory (IIT) to advance artificial general intelligence. By translating core IIT principles into optimizable reward signals, the method employs reinforcement learning to post-train large language models without relying on external data or auxiliary models. This approach enhances the causal structure, coherence, and information integration of generated text. Empirical results demonstrate up to a 31% reduction in output length on out-of-domain tasks while maintaining baseline-level accuracy, alongside significant improvements in model confidence calibration and reasoning efficiency. To the best of our knowledge, this study presents the first end-to-end optimization framework that operationalizes IIT within large language models.

📝 Abstract
The pursuit of Artificial General Intelligence (AGI) is a central goal in language model development, in which consciousness-like processing could serve as a key facilitator. While current language models are not conscious, they exhibit behaviors analogous to certain aspects of consciousness. This paper investigates the implementation of a leading theory of consciousness, Integrated Information Theory (IIT), within language models via a reward-based learning paradigm. IIT provides a formal, axiom-based mathematical framework for quantifying consciousness. Drawing inspiration from its core principles, we formulate a novel reward function that quantifies a text's causality, coherence, and integration, characteristics associated with conscious processing. Empirically, we find that optimizing for this IIT-inspired reward leads to more concise text generation. On out-of-domain tasks, careful tuning achieves up to a 31% reduction in output length while preserving accuracy comparable to the base model. Beyond primary task performance, the broader effects of this training methodology on the model's confidence calibration and test-time computational scaling are analyzed. The proposed framework offers significant practical advantages: it is conceptually simple, computationally efficient, requires no external data or auxiliary models, and leverages a general, capability-driven signal rather than task-specific heuristics. Code is available at https://github.com/MH-Sameti/LLM_PostTraining.git
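The abstract describes combining several IIT-inspired text measures into a single scalar reward for RL post-training. The paper's actual causality, coherence, and integration measures are not specified on this page, so the sketch below uses hypothetical placeholder scorers and an assumed length penalty purely to illustrate the shape of such a composite reward:

```python
# Hedged sketch of an IIT-inspired composite reward for RL post-training.
# All three scorers are HYPOTHETICAL stand-ins, not the paper's actual
# measures; weights and the length penalty are assumed for illustration.

def causality_score(text: str) -> float:
    # Placeholder: more multi-sentence structure -> more causal chaining.
    sentences = [s for s in text.split(".") if s.strip()]
    return min(1.0, len(sentences) / 10.0)

def coherence_score(text: str) -> float:
    # Placeholder: lexical-overlap (Jaccard) proxy between adjacent sentences.
    sents = [set(s.lower().split()) for s in text.split(".") if s.strip()]
    if len(sents) < 2:
        return 0.0
    overlaps = [len(a & b) / max(1, len(a | b)) for a, b in zip(sents, sents[1:])]
    return sum(overlaps) / len(overlaps)

def integration_score(text: str) -> float:
    # Placeholder: type/token ratio as a crude proxy for information density.
    tokens = text.lower().split()
    return len(set(tokens)) / max(1, len(tokens))

def iit_reward(text: str, w=(1.0, 1.0, 1.0), length_penalty: float = 0.001) -> float:
    """Weighted sum of the three IIT-inspired terms, minus a per-token
    length penalty that would favor the conciseness effect the abstract
    reports (shorter outputs at comparable accuracy)."""
    r = (w[0] * causality_score(text)
         + w[1] * coherence_score(text)
         + w[2] * integration_score(text))
    return r - length_penalty * len(text.split())
```

In a real setup this scalar would feed a policy-gradient objective (e.g. PPO or GRPO) during post-training; since the reward is computed from the generated text alone, it needs no external data or auxiliary reward model, consistent with the practical advantages the abstract claims.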
Problem

Research questions and friction points this paper is trying to address.

Consciousness
Integrated Information Theory
Large Language Models
Artificial General Intelligence
Reward-Based Learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrated Information Theory
reward-based learning
consciousness-inspired AI
text conciseness
large language models
Hamid Reza Akbari
Sharif University of Technology, Tehran, Iran
Mohammad Hossein Sameti
Sharif University of Technology, Tehran, Iran
Amir M. Mansourian
Master's student, Sharif University of Technology
Computer Vision, Machine Learning
M. H. Rohban
Sharif University of Technology, Tehran, Iran
Hossein Sameti
Associate Professor, Sharif University of Technology
Speech Recognition and Synthesis, Spoken Dialogue Systems