Uneven Evolution of Cognition Across Generations of Generative AI Models

📅 2026-05-07

🤖 AI Summary
This study proposes a psychometric framework that compares the cognitive profiles of generative AI models with human normative data and tracks how those profiles change across model generations. Using tasks adapted from the Wechsler Adult Intelligence Scale, the authors first evaluate leading multimodal models and find a pronounced asymmetry: verbal comprehension and working memory exceed the 98th percentile of human performance, whereas perceptual reasoning falls below the 1st percentile. Because human norms cannot capture change beyond these limits, the authors develop the Artificial Intelligence Quotient (AIQ) Benchmark and apply it to six generations and two model families. Abstract quantitative reasoning improves far faster when presented linguistically than in a visually analogous format, and visual-perceptual organization remains largely stagnant.
📝 Abstract
The pursuit of artificial general intelligence necessitates robust methods for evaluating the cognitive capabilities of models beyond narrow task performance. Here, we introduce a psychometric framework to assess the cognitive profiles of generative AI, comparing them to human norms and tracking their evolution across generations. Initial evaluation of leading multimodal models using tasks adapted from the Wechsler Adult Intelligence Scale revealed a profoundly uneven cognitive architecture: near-ceiling performance in verbal comprehension and working memory ($>98^{\text{th}}$ percentile) contrasted with near-floor performance in perceptual reasoning ($<1^{\text{st}}$ percentile). To track developmental trajectories beyond human-normed limits, we developed the Artificial Intelligence Quotient (AIQ) Benchmark and applied it to six generations and two model families, revealing significant but asymmetric performance gains. Notably, we uncovered a sharp dissociation between modalities: abstract quantitative reasoning matured far more rapidly when presented linguistically than when presented in a visually analogous format, indicating an architectural bias towards language-based symbolic manipulation. While abstract visual reasoning improved, visual-perceptual organization remained largely stagnant. Collectively, these findings demonstrate that the cognitive abilities of generative models are evolving unevenly, suggesting that scaling and optimization approaches to AGI development alone may be insufficient to overcome fundamental architectural limitations in achieving balanced, human-like general intelligence.
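To give a sense of scale for the percentile figures in the abstract, the sketch below converts between percentiles and Wechsler-style composite scores. It assumes the conventional norming of such composites (mean 100, standard deviation 15, normally distributed); the abstract does not state the paper's exact scoring procedure, and the names `WECHSLER_NORM`, `score_to_percentile`, and `percentile_to_score` are hypothetical, introduced only for illustration.

```python
from statistics import NormalDist

# Assumption: composite scores follow the conventional Wechsler norming
# (mean 100, SD 15). The paper's own scoring procedure is not given here.
WECHSLER_NORM = NormalDist(mu=100, sigma=15)


def score_to_percentile(score: float) -> float:
    """Return the percentile rank (0-100) of a composite score."""
    return 100 * WECHSLER_NORM.cdf(score)


def percentile_to_score(percentile: float) -> float:
    """Return the composite score at a given percentile (0 < p < 100)."""
    return WECHSLER_NORM.inv_cdf(percentile / 100)


if __name__ == "__main__":
    # The two thresholds quoted in the abstract.
    for p in (98, 1):
        print(f"{p}th percentile ~ composite score {percentile_to_score(p):.1f}")
```

Under this assumption, the 98th percentile sits a little above a composite score of 130 and the 1st percentile sits near 65, so the reported gap between the verbal and perceptual indices spans more than four standard deviations of the human distribution.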
Problem

Research questions and friction points this paper is trying to address.

cognitive evolution
generative AI
artificial general intelligence
modality dissociation
cognitive architecture
Innovation

Methods, ideas, or system contributions that make the work stand out.

psychometric framework
Artificial Intelligence Quotient (AIQ)
cognitive architecture
modality dissociation
uneven evolution