AI Summary
Large learning systems often exhibit "jagged intelligence": strong performance along some capability dimensions alongside fragility in others. This work models training as an anisotropic allocation of finite optimization energy in parameter space and attributes jaggedness to the non-uniform interplay among objective structure, data geometry, and representational coupling. It proposes a testable theory of jagged intelligence that introduces formal notions of capability gain, energy share, and jaggedness, establishes lower bounds linking energy concentration to dispersion in capability gains, and proves a finite-budget trade-off theorem. It also analyzes energy-variance regularization and explicit auxiliary objectives as interventions that reshape the optimization landscape. The theory predicts that early energy concentration forecasts later jaggedness, that scaling under a narrow objective need not eliminate anisotropy, and that explicitly funded auxiliary objectives can recover neglected capabilities.
Abstract
Artificial Jagged Intelligence (AJI) denotes a recurring pattern in which large learning systems exhibit strong local capabilities while remaining weak or brittle in other domains. This paper develops a formal theory of AJI as uneven allocation of optimization pressure. We model training as a finite-budget process that distributes gradient-driven update energy across capability-relevant directions in parameter space. In this model, jagged capability profiles arise from anisotropic objective structure, data geometry, and representational coupling rather than from a single scalar quantity called intelligence.
The paper defines capability gain, optimization energy share, and jaggedness, then proves that persistent concentration of cumulative update energy yields lower bounds on dispersion in capability gains. A finite-budget tradeoff theorem shows why prioritizing one capability can impose opportunity costs on others unless positive coupling or shared structure offsets the cost. The analysis also studies redistribution mechanisms, including energy-variance regularization and auxiliary structural objectives, as interventions that reshape the optimization field.
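To make the three quantities concrete, here is a minimal sketch of how energy share and jaggedness could be computed on toy data. The abstract does not state the paper's formal definitions, so everything here is an assumption: energy share is taken as the fraction of cumulative squared update magnitude projected onto each capability direction, jaggedness is taken as the population variance of capability gains, and the names `energy_shares`, `jaggedness`, `directions`, and `updates` are hypothetical.

```python
from statistics import pvariance


def energy_shares(updates, directions):
    """Fraction of cumulative squared update energy along each direction.

    updates: list of update vectors, one per training step.
    directions: list of unit vectors, one per capability (assumed
    orthonormal; this is an illustrative assumption, not the paper's).
    """
    energy = [0.0] * len(directions)
    for u in updates:
        for k, d in enumerate(directions):
            # Projection of the update onto capability direction k.
            proj = sum(ui * di for ui, di in zip(u, d))
            energy[k] += proj * proj
    total = sum(energy)
    if total == 0:
        # No update energy at all: report zero share everywhere.
        return [0.0] * len(directions)
    return [e / total for e in energy]


def jaggedness(gains):
    """Population variance of capability gains (one possible dispersion measure)."""
    return pvariance(gains)


# Toy example: two capabilities, updates that favor the first direction.
directions = [[1.0, 0.0], [0.0, 1.0]]
updates = [[3.0, 1.0], [3.0, 1.0]]
shares = energy_shares(updates, directions)

# Hypothetical capability gains that happen to mirror the energy shares.
gains = [0.9, 0.1]
print(shares, jaggedness(gains))
```

In this toy setting, updates that put most of their squared magnitude on one direction give that direction most of the energy share, and gains that track the shares are widely dispersed. This is the kind of relationship the lower bounds are described as formalizing, though the actual bounds depend on the paper's own definitions.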
The resulting framework links uneven emergence, training architecture, and optimization governance. It predicts that early concentration of update energy should forecast later capability jaggedness; that scaling under a narrow objective need not eliminate anisotropy; and that explicitly funded auxiliary objectives can revive neglected capabilities. AJI is therefore not merely a descriptive label for uneven model behavior, but a testable theory of how finite optimization resources produce concentrated, delayed, and structurally uneven capability formation.