🤖 AI Summary
Foundation models face an inherent tension among privacy preservation, verifiability, and auditability.
Method: This thesis proposes a trustworthy AI system architecture that integrates cryptography and secure computation. It introduces, for the first time, a mechanism based on zero-knowledge proofs for verifying assertions about AI system behavior; combines secure multi-party computation (SMPC) with trusted execution environments (TEEs) to enable private yet auditable inference; and designs an enhanced credential-based access control framework supporting decentralized identity and fine-grained policy enforcement.
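To make the verifiable-assertions idea concrete, below is a minimal Python sketch of a Pedersen-style commitment, a standard building block underneath zero-knowledge proof systems. This is an illustrative toy, not the thesis's actual mechanism: the modulus, the bases G and H, and the committed quantity (a hypothetical accuracy figure) are assumptions made for the example, and a real deployment would use a standardized prime-order group and replace the plain opening with a zero-knowledge proof.

```python
import secrets

# Toy Pedersen-style commitment over Z_p* (illustrative only; not the
# thesis's mechanism). A real system would use a standardized prime-order
# group or elliptic curve, and derive H so nobody knows log_G(H).
P = 2**127 - 1   # a Mersenne prime; far too small for real security
G = 3            # first base (assumed generator-like for this sketch)
H = 7            # second base, assumed independent of G

def commit(message: int, randomness: int) -> int:
    """C = G^m * H^r (mod P): hides the message, binds the committer."""
    return (pow(G, message, P) * pow(H, randomness, P)) % P

def open_check(commitment: int, message: int, randomness: int) -> bool:
    """Verify an opening by recomputing the commitment."""
    return commitment == commit(message, randomness)

# A provider commits to a behavioral claim (here, a hypothetical accuracy
# in basis points); an auditor can later check the opening or, in a full
# ZK protocol, be convinced the value satisfies a policy without seeing it.
claimed_accuracy_bp = 9731
r = secrets.randbelow(P - 1)
c = commit(claimed_accuracy_bp, r)
assert open_check(c, claimed_accuracy_bp, r)
```

Hiding comes from the random blinding factor r; binding rests on the hardness of discrete logarithms and on H being generated so that its discrete log with respect to G is unknown.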
Contribution: The work establishes the first deployment framework for large language models (LLMs) and information retrieval systems that simultaneously guarantees confidentiality, verifiability, and traceability. It delivers a technically viable blueprint, grounded in cryptographic primitives and hardware-enforced security, that directly supports real-world AI governance practices and informs regulatory policy development.
📝 Abstract
The growing societal reliance on artificial intelligence necessitates robust frameworks for ensuring its security, accountability, and trustworthiness. This thesis addresses the complex interplay between privacy, verifiability, and auditability in modern AI, particularly in foundation models, and argues that technical solutions integrating these elements are critical for responsible AI innovation. Drawing on international policy contributions and technical research to identify key risks across the AI pipeline, the work introduces novel technical solutions to critical privacy and verifiability challenges. Specifically, it introduces techniques for enabling verifiable and auditable claims about AI systems using zero-knowledge cryptography; for utilizing secure multi-party computation and trusted execution environments in the auditable, confidential deployment of large language models and information retrieval; and for implementing enhanced delegation mechanisms, credentialing systems, and access controls that secure interactions with autonomous and multi-agent AI systems. Synthesizing these technical advancements, the dissertation presents a cohesive perspective on balancing privacy, verifiability, and auditability in foundation-model-based AI systems, offering practical blueprints for system designers and informing policy discussions on AI safety and governance.
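As one concrete intuition for the SMPC ingredient named in the abstract, here is a minimal additive secret-sharing sketch in Python. It is the generic textbook construction under toy parameters, not the protocol developed in the thesis: each input is split into random shares that individually reveal nothing, while linear operations, such as aggregating a statistic across parties, can be carried out on the shares directly.

```python
import secrets

# Minimal 3-party additive secret sharing over Z_Q (a generic SMPC
# building block; parameters are illustrative, not from the thesis).
Q = 2**61 - 1  # prime modulus, large enough for the toy values below

def share(secret: int, n_parties: int = 3) -> list[int]:
    """Split `secret` into n_parties random shares summing to it mod Q."""
    shares = [secrets.randbelow(Q) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % Q)
    return shares

def reconstruct(shares: list[int]) -> int:
    """Recover the secret; any strict subset of shares reveals nothing."""
    return sum(shares) % Q

# Linearity: parties add their shares locally, so a joint sum is computed
# without any party seeing the other inputs.
a_shares = share(42)
b_shares = share(100)
sum_shares = [(x + y) % Q for x, y in zip(a_shares, b_shares)]
assert reconstruct(sum_shares) == 142
```

Multiplication and comparison require extra machinery (e.g., Beaver triples), which is one reason practical deployments pair secret sharing with additional protocols or hardware-backed components such as TEEs.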