🤖 AI Summary
This paper addresses the conceptual conflation of “safety” and “security” in AI research, a conflation that fragments the field, hinders interdisciplinary collaboration, and creates policy ambiguity. The authors propose a systematic conceptual framework distinguishing the two notions, grounded in rigorous conceptual analysis and cross-domain analogical modeling that draws on analogies from message transmission and building construction. The framework differentiates safety and security along three dimensions: the type of risk (unintentional failure vs. adversarial exploitation), the underlying failure mechanism (design flaws vs. malicious manipulation), and the mitigation objective (robustness to uncertainty vs. resilience against attacks). Crucially, it also identifies and formalizes their bidirectional interdependence: safety deficiencies can expand the attack surface, while security vulnerabilities can trigger non-malicious system failures. The framework clarifies research boundaries, enables precise problem scoping, facilitates cross-disciplinary modeling, and supports layered governance design, providing a foundational theoretical tool for trustworthy AI development, evaluation, and policymaking.
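To make the three-dimensional distinction concrete, here is a minimal Python sketch that encodes it as a simple data structure; the names (`RiskProfile`, `RiskType`, `FailureMechanism`, `MitigationObjective`, `is_coupled`) are illustrative assumptions for this summary, not part of the paper's formalism.

```python
from dataclasses import dataclass
from enum import Enum

# Illustrative encoding of the framework's three dimensions.
# All class and member names are hypothetical, not taken from the paper.

class RiskType(Enum):
    UNINTENTIONAL_FAILURE = "safety"       # accidents, misbehavior under uncertainty
    ADVERSARIAL_EXPLOITATION = "security"  # deliberate attacks on the system

class FailureMechanism(Enum):
    DESIGN_FLAW = "safety"                 # specification or engineering errors
    MALICIOUS_MANIPULATION = "security"    # e.g., poisoning, evasion, misuse

class MitigationObjective(Enum):
    ROBUSTNESS_TO_UNCERTAINTY = "safety"
    RESILIENCE_AGAINST_ATTACKS = "security"

@dataclass
class RiskProfile:
    """Classifies one AI risk along the framework's three dimensions."""
    risk_type: RiskType
    failure_mechanism: FailureMechanism
    mitigation_objective: MitigationObjective

    def is_coupled(self) -> bool:
        """Flags the bidirectional interdependence: a profile that mixes
        safety- and security-labelled dimensions indicates a coupled risk,
        e.g., a safety deficiency that expands the attack surface."""
        labels = {self.risk_type.value,
                  self.failure_mechanism.value,
                  self.mitigation_objective.value}
        return len(labels) > 1

# Example: a design flaw (safety origin) that an attacker later exploits.
coupled_risk = RiskProfile(RiskType.ADVERSARIAL_EXPLOITATION,
                           FailureMechanism.DESIGN_FLAW,
                           MitigationObjective.RESILIENCE_AGAINST_ATTACKS)
print(coupled_risk.is_coupled())  # True: safety and security dimensions interact
```

In this toy encoding, a "coupled" profile is simply one whose dimensions do not all fall on the same side of the safety/security divide, mirroring the paper's point that the two notions interlock rather than partition cleanly.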
📝 Abstract
Artificial Intelligence (AI) is rapidly being integrated into critical systems across domains ranging from healthcare to autonomous vehicles. While this integration brings immense benefits, it also introduces significant risks, including those arising from AI misuse. Within the discourse on managing these risks, the terms "AI Safety" and "AI Security" are often used, sometimes interchangeably, resulting in conceptual confusion. This paper aims to demystify the distinction and delineate the precise research boundaries between AI Safety and AI Security. We provide rigorous definitions, outline their respective research focuses, and explore their interdependence, including how security breaches can precipitate safety failures and vice versa. Using clear analogies from message transmission and building construction, we illustrate these distinctions. Clarifying these boundaries is crucial for guiding precise research directions, fostering effective cross-disciplinary collaboration, enhancing policy effectiveness, and ultimately promoting the deployment of trustworthy AI systems.