🤖 AI Summary
This study addresses the current lack of effective metrics for assessing the extent of AI R&D automation (AIRDA) and its implications for AI capabilities, safety, and human oversight. It proposes the first systematic, operational framework of multidimensional monitoring indicators, integrating perspectives from economics, AI safety, and technology governance. Key dimensions include the capital share of AI R&D spending, the allocation of researchers' time, and AI subversion incidents. By combining indicator design, empirical data analysis, and policy-oriented insights, this work fills a critical gap in evidence-based evaluation. The resulting framework is designed for adoption by firms, non-profit organisations, and governments to enable timely risk identification, inform safety protocols, and support robust AI governance and policymaking.
📄 Abstract
The automation of AI R&D (AIRDA) could have significant implications, but its extent and ultimate effects remain uncertain. We need empirical data to resolve these uncertainties, but existing data (primarily capability benchmarks) may not reflect real-world automation or capture its broader consequences, such as whether AIRDA accelerates capabilities more than safety progress or whether our ability to oversee AI R&D can keep pace with its acceleration. To address these gaps, this work proposes metrics to track the extent of AIRDA and its effects on AI progress and oversight. The metrics span dimensions such as capital share of AI R&D spending, researcher time allocation, and AI subversion incidents, and could help decision makers understand the potential consequences of AIRDA, implement appropriate safety measures, and maintain awareness of the pace of AI development. We recommend that companies and third parties (e.g. non-profit research organisations) start to track these metrics, and that governments support these efforts.
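To make the first of these metrics concrete, the sketch below shows one possible way to operationalise the capital share of AI R&D spending as a time series a lab or third party could track. This is an illustrative assumption, not the paper's specification: the spending categories, figures, and the two-way split between capital (e.g. compute) and researcher labor are hypothetical placeholders.

```python
# Illustrative sketch (not from the paper): one way to operationalise the
# "capital share of AI R&D spending" metric. All figures are hypothetical.

def capital_share(capital_spend: float, labor_spend: float) -> float:
    """Fraction of AI R&D spending going to capital (e.g. compute)
    rather than researcher labor. A rising share over time is one
    possible signal that AI systems are substituting for human R&D work."""
    total = capital_spend + labor_spend
    if total <= 0:
        raise ValueError("total AI R&D spend must be positive")
    return capital_spend / total

# Hypothetical quarterly spending (millions USD) for a single lab.
quarters = {
    "2024Q1": {"capital": 120.0, "labor": 180.0},
    "2024Q2": {"capital": 150.0, "labor": 175.0},
    "2024Q3": {"capital": 200.0, "labor": 170.0},
}

for quarter, spend in quarters.items():
    share = capital_share(spend["capital"], spend["labor"])
    print(f"{quarter}: capital share = {share:.1%}")
```

In practice the denominator would need a more careful definition (e.g. whether to count data acquisition, facilities, or outsourced services), but the basic ratio illustrates the kind of trend the proposed monitoring framework is meant to capture.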