Abstract
Robustness is a key requirement for high-risk AI systems under the EU Artificial Intelligence Act (AI Act). However, both its definition and its assessment methods remain underspecified, leaving providers with little concrete direction on how to demonstrate compliance. This stems from the Act's horizontal approach, which establishes general obligations applicable across all AI systems but delegates technical guidance to harmonised standards. This paper investigates what it means for AI systems to be robust and illustrates the need for context-sensitive standardisation. We argue that robustness is not a fixed property of a system: it depends on which aspects of performance are expected to remain stable ("robustness of what"), on the perturbations the system must withstand ("robustness to what") and on the operational environment. We identify three contextual drivers (use case, data and model) that shape the relevant perturbations and influence the choice of tests, metrics and benchmarks used to evaluate robustness. The standardisation request for the AI Act explicitly recognises the need to provide at least a range of technical options that providers can assess and implement in light of the system's purpose, yet the planned standards, still focused on horizontal coverage, do not offer this level of detail. Building on this, we propose a context-sensitive, multi-layered standardisation framework in which horizontal standards set common principles and terminology, while domain-specific standards identify risks across the AI lifecycle and guide appropriate practices, organised in a dynamic repository where providers can propose new informative methods and share lessons learned. Such a system reduces the interpretative burden, mitigates arbitrariness and addresses the obsolescence of static standards, ensuring that robustness assessment is both adaptable and operationally meaningful.