🤖 AI Summary
AI systems deployed in safety-critical domains face mounting challenges regarding safety assurance, regulatory compliance (particularly under the EU AI Act), and societal acceptability.
Method: This study proposes the TÜV AUSTRIA Trustworthy AI Framework, which operationalizes EU AI Act requirements into testable, auditable technical standards. It introduces the novel concept of "functional trustworthiness," integrating statistically defined application-domain boundaries, risk-informed performance metrics, and reproducible statistical validation on independent datasets across the AI lifecycle. A multidimensional auditing system is established, unifying secure software engineering practices, fairness assessment, robustness analysis, privacy protection, and distributional shift monitoring, culminating in an evolving, end-to-end audit catalog.
Contribution/Results: The framework has been validated in real-world certification processes, providing regulators, developers, and users with a standardized, certifiable pathway for deploying AI systems compliant with European regulatory and technical expectations.
📄 Abstract
Artificial intelligence is increasingly being adopted in safety-critical applications, yet practical schemes for certifying that AI systems are safe, lawful, and socially acceptable remain scarce. This white paper presents the TÜV AUSTRIA Trusted AI framework, an end-to-end audit catalog and methodology for assessing and certifying machine learning systems. The audit catalog has been in continuous development since 2019 in an ongoing collaboration with scientific partners. Building on three pillars - Secure Software Development, Functional Requirements, and Ethics & Data Privacy - the catalog translates the high-level obligations of the EU AI Act into specific, testable criteria. Its core concept of functional trustworthiness couples a statistically defined application domain with risk-based minimum performance requirements and statistical testing on independently sampled data, providing transparent and reproducible evidence of model quality in real-world settings. We provide an overview of the functional requirements that we assess, which are oriented around the lifecycle of an AI system. In addition, we share lessons learned from the practical application of the audit catalog, highlighting common pitfalls we encountered, such as data leakage scenarios, inadequate domain definitions, neglect of biases, and a lack of distribution drift controls. We further discuss key aspects of certifying AI systems, such as robustness, algorithmic fairness, and post-certification requirements, outlining both our current conclusions and a roadmap for future research. By aligning technical best practices with emerging European standards, the approach offers regulators, providers, and users a practical roadmap for legally compliant, functionally trustworthy, and certifiable AI systems.