Towards Measurement Theory for Artificial Intelligence

📅 2025-07-07
📈 Citations: 0
Influential: 0
🤖 AI Summary
Current AI capability evaluation lacks a formal measurement foundation, suffers from incomparability across systems and across evaluation methods, and remains disconnected from the quantitative risk analysis used in engineering safety. To address these gaps, this paper proposes a hierarchical AI measurement framework that rigorously distinguishes direct from indirect observables and formally characterizes how definitions of AI capability depend on the measurement operations and scales chosen. Methodologically, the framework integrates classical measurement theory, formal modeling, and quantitative risk analysis, drawing on established paradigms from engineering and safety science. Its core contribution is a pathway toward a systematic, calibratable, and traceable taxonomy of AI phenomena and capability representations. This would enable standardized, reproducible evaluation of AI systems, improving the reliability and interoperability of assessment outcomes across scientific validation, engineering deployment, and regulatory decision-making.

📝 Abstract
We motivate and outline a programme for a formal theory of measurement of artificial intelligence. We argue that formalising measurement for AI will allow researchers, practitioners, and regulators to: (i) make comparisons between systems and the evaluation methods applied to them; (ii) connect frontier AI evaluations with established quantitative risk analysis techniques drawn from engineering and safety science; and (iii) foreground how what counts as AI capability is contingent upon the measurement operations and scales we elect to use. We sketch a layered measurement stack, distinguish direct from indirect observables, and signpost how these ingredients provide a pathway toward a unified, calibratable taxonomy of AI phenomena.
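The abstract's third point — that what counts as capability is contingent on the measurement scale — can be illustrated with a small sketch. This example is not from the paper; the system names and scores are hypothetical. It contrasts an ordinal reading of benchmark accuracies (only the ranking of systems is licensed) with an interval reading (score differences are also comparable), the distinction drawn from classical measurement theory's scale types.

```python
# Illustrative sketch (hypothetical data): how capability claims depend on
# the assumed measurement scale of a benchmark score.

# Hypothetical accuracies for three AI systems on the same benchmark.
scores = {"A": 0.62, "B": 0.71, "C": 0.90}

# Ordinal reading: only the ordering of systems is meaningful.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # ['C', 'B', 'A']

# Interval reading (a stronger assumption): score *differences* are also
# meaningful, so we may claim the C-B gap exceeds the B-A gap.
gap_cb = scores["C"] - scores["B"]  # ~0.19
gap_ba = scores["B"] - scores["A"]  # ~0.09
print(gap_cb > gap_ba)  # True, but only licensed on an interval scale
```

On an ordinal scale both readings agree about who is "best", but the second claim (C's lead over B is larger than B's lead over A) is not a valid inference — which is exactly the kind of scale-dependence the proposed framework aims to make explicit.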
Problem

Research questions and friction points this paper is trying to address.

Develop a formal theory for measuring artificial intelligence capabilities
Compare AI systems and evaluation methods using standardized metrics
Integrate AI assessments with engineering risk analysis techniques
Innovation

Methods, ideas, or system contributions that make the work stand out.

Develops formal AI measurement theory framework
Connects AI evaluations with risk analysis
Proposes layered measurement stack approach