Intelligence Test

šŸ“… 2025-02-26
šŸ“ˆ Citations: 0
✨ Influential: 0
šŸ¤– AI Summary
Quantifying intelligence in AI systems and identifying the fundamental bottlenecks to autonomy remain unresolved challenges. Method: We propose an "Intelligence Test" framework that formalizes intelligence as *survival capacity*, in the sense of natural selection, and introduces a computable metric: the probability distribution (expectation and variance) of task failure counts before success. Fewer failures and lower variance indicate higher intelligence; finite expectation and variance signal the emergence of autonomous intelligence. Contribution/Results: This work translates natural selection into a computable intelligence scale. Theoretically, it reveals a criticality property in human tasks and identifies superficial pattern imitation, rather than mechanistic understanding, as the core barrier to AI autonomy. Through probabilistic modeling, criticality analysis, multi-task empirical evaluation, and parameter extrapolation, the authors find that mainstream AI systems fall significantly short of autonomous performance on vision, search, recommendation, and language tasks. Achieving general autonomy may require ~10²⁶ parameters, a scale projected to take ~70 years even under continued Moore's Law, suggesting that a paradigm shift is needed.
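
The failure-count metric admits a simple illustration. Below is a minimal sketch (not the authors' code), assuming each attempt at a task succeeds independently with probability p, so the number of failures before the first success is geometrically distributed; the function names and the i.i.d. assumption are illustrative only.

```python
import random

def failures_before_success(p: float, rng: random.Random, cap: int = 1_000_000) -> int:
    """Count failed attempts before the first success, assuming each attempt
    succeeds independently with probability p (a toy model, not the paper's
    task model)."""
    failures = 0
    while rng.random() >= p and failures < cap:
        failures += 1
    return failures

def failure_statistics(p: float, trials: int = 10_000, seed: int = 0) -> tuple[float, float]:
    """Empirical expectation and variance of the failure count; finite values
    of both correspond, in the paper's terms, to an autonomous level of
    intelligence on the task."""
    rng = random.Random(seed)
    counts = [failures_before_success(p, rng) for _ in range(trials)]
    mean = sum(counts) / trials
    var = sum((c - mean) ** 2 for c in counts) / trials
    return mean, var

if __name__ == "__main__":
    for p in (0.5, 0.1, 0.01):
        mean, var = failure_statistics(p)
        print(f"success prob {p}: mean failures ā‰ˆ {mean:.2f}, variance ā‰ˆ {var:.2f}")
```

Under this simplified model the expected failure count is $(1-p)/p$ and the variance $(1-p)/p^2$, both finite; the paper's criticality argument concerns tasks whose failure distributions are heavy-tailed enough that these moments effectively diverge.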

šŸ“ Abstract
How does intelligence emerge? We propose that intelligence is not a sudden gift or random occurrence, but rather a necessary trait for species to survive through Natural Selection. If a species passes the test of Natural Selection, it demonstrates the intelligence to survive in nature. Extending this perspective, we introduce Intelligence Test, a method to quantify the intelligence of any subject on any task. Just as species evolve by trial and error, Intelligence Test quantifies intelligence by the number of failed attempts before success: fewer failures correspond to higher intelligence. When the expectation and variance of failure counts are both finite, it signals the achievement of an autonomous level of intelligence. Using Intelligence Test, we comprehensively evaluate existing AI systems. Our results show that while AI systems achieve a level of autonomy in simple tasks, they are still far from autonomous in more complex tasks, such as vision, search, recommendation, and language. While scaling model size might help, this would come at an astronomical cost. Projections suggest that achieving general autonomy would require an unimaginable $10^{26}$ parameters. Even if Moore's Law were to hold continuously, reaching such a parameter scale would take $70$ years. This staggering cost highlights the complexity of human tasks and the inadequacies of current AI. To further understand this phenomenon, we conduct a theoretical analysis. Our simulations suggest that human tasks possess a criticality property; as a result, autonomy requires a deep understanding of a task's underlying mechanisms. Current AI, however, does not fully grasp these mechanisms and instead relies on superficial mimicry, making it difficult to reach an autonomous level. We believe Intelligence Test can not only guide the future development of AI but also offer profound insights into human intelligence itself.
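
The $70$-year figure is consistent with a back-of-the-envelope extrapolation. Assuming present-day frontier models sit near $10^{12}$ parameters and that parameter counts double roughly every 18 months (both assumptions of this sketch, not numbers stated in the abstract), the time to reach $10^{26}$ parameters is

$$
t \approx \log_2\!\left(\frac{10^{26}}{10^{12}}\right) \times 1.5\ \text{years} = 14\,\log_2 10 \times 1.5\ \text{years} \approx 46.5 \times 1.5\ \text{years} \approx 70\ \text{years}.
$$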
Problem

Research questions and friction points this paper is trying to address.

Quantify intelligence through failure counts
Evaluate AI systems' autonomy levels
Understand criticality in human tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Quantify intelligence via failure counts
Evaluate AI autonomy across tasks
Simulate criticality in human tasks (see the sketch below)
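
To make the criticality point concrete, here is a hypothetical toy simulation (not the paper's actual experiment): failure counts drawn from a heavy-tailed, Pareto-like distribution have a sample variance that keeps growing with the number of trials when the tail exponent $\alpha \le 2$, whereas for $\alpha > 2$ it stabilizes; the former regime is the one the framework associates with non-autonomous behavior.

```python
import random

def pareto_failures(alpha: float, rng: random.Random) -> int:
    """Draw a failure count from a heavy-tailed (Pareto-like) distribution with
    tail exponent alpha; a toy stand-in for a 'critical' task, not the paper's
    actual task model."""
    u = rng.random()
    # Inverse-CDF sample of a Pareto(x_m=1, alpha), floored to an integer count.
    return int((1.0 - u) ** (-1.0 / alpha)) - 1

def empirical_variance(alpha: float, trials: int, seed: int = 0) -> float:
    """Sample variance of the failure counts over `trials` draws."""
    rng = random.Random(seed)
    counts = [pareto_failures(alpha, rng) for _ in range(trials)]
    mean = sum(counts) / trials
    return sum((c - mean) ** 2 for c in counts) / trials

if __name__ == "__main__":
    # alpha = 1.5: infinite true variance, so the estimate grows with sample size.
    # alpha = 3.0: finite true variance, so the estimate settles down.
    for alpha in (1.5, 3.0):
        for trials in (1_000, 10_000, 100_000):
            print(f"alpha={alpha}, trials={trials}: variance ā‰ˆ {empirical_variance(alpha, trials):,.1f}")
```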