🤖 AI Summary
This paper addresses existential AI risks, including value misalignment, ethical violations, emergency-response failure, resource misuse, and negative societal externalities, by proposing a systematic governance framework. Methodologically, it introduces a dual-path "Five-Pillar, Six-Mechanism" architecture: first, distinguishing physical AI from generative AI to establish a safety-reinforcement mechanism at the digital-physical fault line, grounded in the facts that AI is purely software and that every AI-driven physical action must be digitized; second, formalizing a minimum set of AI Mandates (AIMs) that embed value alignment, legal compliance, ethical auditing, human-AI collaborative decision-making, resource isolation, and risk mitigation. The framework integrates intrinsic technical control with socio-technical governance, functioning as a foundational "braking system" for AI decisions. Its contribution is a verifiable, non-bypassable, control-theoretic foundation for AGI/ASI that constrains AI risks to the level of human operational error and offers a deployable architectural paradigm for global AI regulation.
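To make the mandate path concrete, here is a minimal sketch in Python; the names (`Decision`, `Mandate`, `permitted`) are our illustration, not the paper's implementation. The idea it shows: each mechanism is a veto-only check, so a decision proceeds only when every mandate consents, and the strictest check always binds.

```python
"""Minimal sketch (assumed names, not the paper's code) of embedding
AI Mandates (AIMs) as a veto chain over digitized AI decisions."""

from typing import Callable, List

Decision = dict                    # a digitized AI decision-action
Mandate = Callable[[Decision], bool]


def value_alignment(d: Decision) -> bool:
    # Placeholder: defer to a user-approved value policy.
    return d.get("user_approved", False)


def legal_compliance(d: Decision) -> bool:
    # Placeholder: check against encoded laws and regulations.
    return not d.get("violates_regulation", False)


def resource_isolation(d: Decision) -> bool:
    # Deny access to anything outside an explicitly granted set.
    granted = {"sandbox"}
    return set(d.get("resources", [])).issubset(granted)


def permitted(d: Decision, mandates: List[Mandate]) -> bool:
    """A decision proceeds only if every mandate consents;
    mandates can veto, never grant."""
    return all(m(d) for m in mandates)


aims_in_ai = [value_alignment, legal_compliance, resource_isolation]
print(permitted({"user_approved": True, "resources": ["sandbox"]}, aims_in_ai))  # True
```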
📝 Abstract
Artificial intelligence (AI) is advancing rapidly, but achieving complete human control over AI risks remains an unsolved problem, akin to driving the fast AI "train" without a "brake system." By exploring fundamental control mechanisms at the key elements of AI decisions, this paper develops a systematic solution for thoroughly controlling AI risks: an architecture for AI governance and legislation with five pillars supported by six control mechanisms, illustrated through a minimum set of AI Mandates (AIMs). Three of the AIMs must be built inside AI systems and three in society to address five major areas of AI risk: 1) align AI values with those of human users; 2) constrain AI decision-actions by societal ethics, laws, and regulations; 3) build in human intervention options for emergencies and shut-off switches for existential threats; 4) limit AI access to resources to reinforce the controls inside AI; 5) mitigate spillover risks such as job loss from AI. We also highlight how AI governance differs for physical AI systems versus generative AI. We further discuss how to strengthen analog physical safeguards that exploit AI's intrinsic disconnect from the analog physical world, namely that AI is pure software code running on chips controlled by humans and that all AI-driven physical actions must first be digitized, so that smarter AI/AGI/ASI cannot circumvent core safety controls. These findings establish a theoretical foundation for AI governance and legislation as the basic structure of a "brake system" for AI decisions. If enacted, these controls can rein in AI dangers as completely as humanly possible, closing off large areas of currently wide-open AI risk and reducing overall AI risk to the level of residual human error.
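The digital-physical fault line can likewise be sketched in code. The sketch below is our illustration under assumed names (`ActuatorDriver`, `hardware_enable`, `send_to_motor`), not the paper's design: because AI can only emit bits, the driver that turns bits into actuation is a human-controlled choke point, and an analog safeguard such as a physical relay cannot be flipped by any software path.

```python
"""Minimal sketch (our illustration, not the paper's code) of the
digital-physical fault line: the last hop before the physical world
runs on human-controlled hardware, outside the AI's reach."""

from dataclasses import dataclass
from typing import Callable


@dataclass
class ActuatorDriver:
    """Turns digitized commands into physical actuation; the AI has
    no write access to this driver or to the hardware enable line."""
    hardware_enable: Callable[[], bool]   # reads a physical switch/relay
    validate: Callable[[bytes], bool]     # mandate checks on the command

    def actuate(self, command: bytes) -> bool:
        # Analog safeguard first: if the human-held relay is open,
        # nothing the AI computes can produce motion.
        if not self.hardware_enable():
            return False
        # Then the digitized command itself must pass the mandates.
        if not self.validate(command):
            return False
        send_to_motor(command)            # hypothetical low-level call
        return True


def send_to_motor(command: bytes) -> None:
    print(f"actuating: {command!r}")      # stand-in for real device I/O


# Usage: with the relay open, software approval is irrelevant.
driver = ActuatorDriver(hardware_enable=lambda: False, validate=lambda c: True)
assert driver.actuate(b"MOVE ARM 10cm") is False
```

The design choice this illustrates is ordering: the analog check precedes all digital logic, so a shut-off switch held by humans remains effective even against an AI that fully subverts the software-side mandates.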