🤖 AI Summary
This work addresses the challenge of deploying generative AI in high-stakes decision-making, where hallucinated reasoning, unsupported claims, and weak traceability often preclude compliance with certification-grade accountability requirements. To bridge this gap, the authors propose a "compliance-by-construction" architecture that integrates typed argumentation graphs, retrieval-augmented generation (RAG), a formal validation kernel, and W3C PROV-based provenance tracking. This framework requires that every AI-generated claim be grounded in authoritative evidence and pass explicit inference constraints before being admitted into the official decision record. An evaluation based on enforceable invariants and worked examples indicates that deterministic validation rules can block unsubstantiated assertions from entering the decision pipeline while letting generative AI accelerate the construction of compliant, auditable arguments, thereby enabling controlled, verifiable, and accountable use of generative AI in high-assurance settings.
📝 Abstract
High-stakes decision systems increasingly require structured justification, traceability, and auditability to ensure accountability and regulatory compliance. Formal arguments commonly used in the certification of safety-critical systems provide a mechanism for structuring claims, reasoning, and evidence in a verifiable manner. At the same time, generative artificial intelligence systems are increasingly integrated into decision-support workflows, assisting with drafting explanations, summarizing evidence, and generating recommendations. However, current deployments often rely on language models as loosely constrained assistants, which introduces risks such as hallucinated reasoning, unsupported claims, and weak traceability. This paper proposes a compliance-by-construction architecture that integrates Generative AI (GenAI) with structured formal argument representations. The approach treats each AI-assisted step as a claim that must be supported by verifiable evidence and validated against explicit reasoning constraints before it becomes part of an official decision record. The architecture combines four components: i) a typed Argument Graph representation inspired by assurance-case methods, ii) retrieval-augmented generation (RAG) to draft argument fragments grounded in authoritative evidence, iii) a reasoning and validation kernel enforcing completeness and admissibility constraints, and iv) a provenance ledger aligned with the W3C PROV standard to support auditability. We present a system design and an evaluation strategy based on enforceable invariants and worked examples. The analysis suggests that deterministic validation rules can prevent unsupported claims from entering the decision record while allowing GenAI to accelerate argument construction.
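The gating idea in the abstract, that an AI-drafted claim enters the decision record only after it is grounded in evidence and checked by a validation kernel that also logs PROV-style provenance, can be sketched in a few lines. This is an illustrative sketch, not the paper's implementation: the `Claim`, `Evidence`, and `ValidationKernel` names and the simplified "has at least one evidence link" admissibility rule are assumptions made for the example, and the provenance entries only approximate W3C PROV relations (`wasDerivedFrom`, `wasAttributedTo`).

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class Evidence:
    id: str
    source: str  # authoritative artifact the evidence comes from

@dataclass
class Claim:
    id: str
    text: str
    evidence: list = field(default_factory=list)  # Evidence grounding this claim

class ValidationKernel:
    """Admits a claim into the decision record only if it passes the
    (simplified) admissibility rule; every admission is logged with
    PROV-style provenance so the record stays auditable."""

    def __init__(self):
        self.decision_record = []  # admitted, validated claims
        self.provenance = []       # PROV-like statements per admitted claim

    def admit(self, claim: Claim, agent: str = "genai-drafter") -> bool:
        # Admissibility constraint (illustrative): a claim with no
        # supporting evidence is rejected and never enters the record.
        if not claim.evidence:
            return False
        self.decision_record.append(claim)
        for ev in claim.evidence:
            self.provenance.append({
                "entity": claim.id,
                "wasDerivedFrom": ev.id,      # claim traceable to evidence
                "wasAttributedTo": agent,     # which agent drafted the claim
                "generatedAtTime": datetime.now(timezone.utc).isoformat(),
            })
        return True

kernel = ValidationKernel()
grounded = Claim("c1", "Component meets timing budget",
                 [Evidence("e1", "timing-analysis-report")])
unsupported = Claim("c2", "No further testing is needed")

kernel.admit(grounded)      # admitted: evidence-backed
kernel.admit(unsupported)   # blocked: no supporting evidence
```

The point of the sketch is the determinism the abstract emphasizes: the generative model may draft claims freely, but admission into `decision_record` is decided by a fixed rule, and every admitted claim carries a provenance trail linking it back to its evidence and the drafting agent.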