🤖 AI Summary
This study investigates whether large language models (LLMs) exhibit human brain-like mechanisms of information integration and how such mechanisms influence learning and behavior. Drawing on information decomposition theory, the authors conduct cross-architectural analyses, ablation studies, and comparisons between reinforcement learning and supervised fine-tuning. They reveal, for the first time, a high-cooperativity information processing core in intermediate layers of LLMs, whose organizational pattern closely resembles that of the human brain and emerges spontaneously during training. Ablating this cooperative region significantly impairs model performance, while targeted fine-tuning of this region yields substantially greater improvements than fine-tuning redundant regions, confirming its critical role in intelligent behavior. These findings further inform a novel, efficient fine-tuning strategy centered on cooperative neural substrates.
📄 Abstract
The independent evolution of intelligence in biological and artificial systems offers a unique opportunity to identify its fundamental computational principles. Here we show that large language models spontaneously develop synergistic cores (components where information integration exceeds that of the individual parts) remarkably similar to those in the human brain. Using principles of information decomposition across multiple LLM families and architectures, we find that areas in middle layers exhibit synergistic processing while early and late layers rely on redundancy, mirroring the informational organisation in biological brains. This organisation emerges through learning and is absent in randomly initialised networks. Crucially, ablating synergistic components causes disproportionate behavioural changes and performance loss, aligning with theoretical predictions about the fragility of synergy. Moreover, fine-tuning synergistic regions through reinforcement learning yields significantly greater performance gains than training redundant components, yet supervised fine-tuning shows no such advantage. This convergence suggests that synergistic information processing is a fundamental property of intelligence, providing targets for principled model design and testable predictions for biological intelligence.
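The notion of synergy invoked above (information carried by parts jointly that none carries alone) can be illustrated with the textbook XOR case, independent of this paper's specific methodology. The sketch below is not the authors' pipeline; it simply computes mutual information by direct enumeration to show a target where each input alone is uninformative (0 bits) yet the pair together is fully informative (1 bit), the extreme of synergistic integration.

```python
from itertools import product
from math import log2
from collections import Counter

def mutual_information(pairs):
    """I(A;B) in bits, estimated from a list of (a, b) samples.

    Here we pass the full uniform joint distribution, so the
    estimate is exact.
    """
    n = len(pairs)
    p_ab = Counter(pairs)                    # joint counts
    p_a = Counter(a for a, _ in pairs)       # marginal counts of A
    p_b = Counter(b for _, b in pairs)       # marginal counts of B
    return sum((c / n) * log2((c / n) / ((p_a[a] / n) * (p_b[b] / n)))
               for (a, b), c in p_ab.items())

# Target Y = X1 XOR X2 over the uniform distribution of two bits.
samples = [(x1, x2, x1 ^ x2) for x1, x2 in product((0, 1), repeat=2)]

i_x1_y = mutual_information([(x1, y) for x1, _, y in samples])
i_x2_y = mutual_information([(x2, y) for _, x2, y in samples])
i_joint = mutual_information([((x1, x2), y) for x1, x2, y in samples])

print(i_x1_y, i_x2_y, i_joint)  # 0.0 0.0 1.0
```

Because I(X1; Y) = I(X2; Y) = 0 while I(X1, X2; Y) = 1 bit, all of the information about Y here is synergistic: it exists only in the combination of the parts, which is also why ablating one member of a synergistic pair is disproportionately destructive.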