🤖 AI Summary
Current AI capability evaluation lacks a formal measurement foundation, suffers from incomparability across systems and evaluation methods, and remains disconnected from the quantitative risk analysis used in engineering safety. To address these gaps, this paper outlines a hierarchical measurement-theory framework for AI that rigorously distinguishes direct from indirect observables and formally characterizes how definitions of AI capability depend on the chosen measurement operations and scales. Methodologically, the framework integrates classical measurement theory, formal modeling, and quantitative risk analysis, drawing on established paradigms from engineering and safety science. Its core contribution is a pathway toward the first systematic, calibratable, and traceable taxonomy of AI phenomena and capability representations. This would enable standardized, reproducible AI system evaluation, improving the reliability and interoperability of assessment outcomes across scientific validation, engineering deployment, and regulatory decision-making.
📝 Abstract
We motivate and outline a programme for a formal theory of measurement of artificial intelligence. We argue that formalising measurement for AI will allow researchers, practitioners, and regulators to: (i) make principled comparisons both across systems and across the evaluation methods applied to them; (ii) connect frontier AI evaluations with established quantitative risk analysis techniques drawn from engineering and safety science; and (iii) foreground that what counts as AI capability is contingent upon the measurement operations and scales we elect to use. We sketch a layered measurement stack, distinguish direct from indirect observables, and signpost how these ingredients provide a pathway toward a unified, calibratable taxonomy of AI phenomena.
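To make the direct/indirect distinction and the scale-dependence point concrete, the sketch below models a minimal slice of such a measurement stack in Python. It is illustrative only and not drawn from the paper; all names (`DirectObservation`, `CapabilityEstimate`, `estimate_capability`) and the choice of a mean-success-rate scale are assumptions made for exposition.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable

# Direct observable: a raw outcome recorded by a concrete measurement
# operation, e.g. pass/fail on one benchmark item under a fixed harness.
@dataclass(frozen=True)
class DirectObservation:
    item_id: str
    outcome: float  # 1.0 = success, 0.0 = failure

# Indirect observable: a capability value constructed from direct
# observations via an explicit, named scale. It is meaningful only
# relative to that scale and estimator.
@dataclass(frozen=True)
class CapabilityEstimate:
    scale: str    # name of the scale the estimate lives on
    value: float
    n: int        # number of underlying observations, kept for traceability

def estimate_capability(
    observations: list[DirectObservation],
    scale: str = "mean-success-rate",
    aggregate: Callable[[list[float]], float] = mean,
) -> CapabilityEstimate:
    """Map direct observables to an indirect one under a named scale."""
    outcomes = [obs.outcome for obs in observations]
    return CapabilityEstimate(scale=scale, value=aggregate(outcomes), n=len(outcomes))

if __name__ == "__main__":
    # Thirty hypothetical benchmark items; every third item fails.
    run = [DirectObservation(f"item-{i}", float(i % 3 != 0)) for i in range(30)]
    est = estimate_capability(run)
    print(f"{est.scale}: {est.value:.2f} (n={est.n})")  # mean-success-rate: 0.67 (n=30)
```

Swapping the aggregator or the named scale (say, a rank-based ordinal scale instead of a ratio-scaled success rate) would yield a different "capability" from the same raw observations, which is precisely the contingency that point (iii) of the abstract highlights.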