Safety Co-Option and Compromised National Security: The Self-Fulfilling Prophecy of Weakened AI Risk Thresholds

📅 2025-04-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper identifies “safety revisionism” in AI safety governance: AI technologists (primarily industry labs and “AI safety”-focused organizations) advocate lowered risk thresholds for military AI uses, citing an ill-defined “AI arms race” and speculative “existential” risks, bypassing democratic deliberation and setting up a race to the bottom in national security standards. Methodologically, the study integrates risk governance theory, science and technology policy analysis, and international humanitarian law (IHL) in a cross-disciplinary critical assessment. It frames “safety revisionism” as the substitution of traditional safety methods and terminology with ill-defined alternatives, eroding established risk thresholds and thereby generating self-fulfilling national security crises. The paper’s principal contribution is its argument that evaluation frameworks for AI-based military systems, foundation models included, must be anchored in established risk thresholds and remain aligned with IHL, so as to safeguard US critical and defense infrastructure and ensure legal compliance.

📝 Abstract
Risk thresholds provide a measure of the level of risk exposure that a society or individual is willing to withstand, ultimately shaping how we determine the safety of technological systems. Against the backdrop of the Cold War, the first risk analyses, such as those devised for nuclear systems, cemented societally accepted risk thresholds against which safety-critical and defense systems are now evaluated. But today, the appropriate risk tolerances for AI systems have yet to be agreed on by global governing efforts, despite the need for democratic deliberation regarding the acceptable levels of harm to human life. Absent such AI risk thresholds, AI technologists (primarily industry labs, as well as "AI safety"-focused organizations) have instead advocated for risk tolerances skewed by a purported AI arms race and speculative "existential" risks, taking over the arbitration of risk determinations with life-or-death consequences, subverting democratic processes. In this paper, we demonstrate how such approaches have allowed AI technologists to engage in "safety revisionism," substituting traditional safety methods and terminology with ill-defined alternatives that vie for the accelerated adoption of military AI uses at the cost of lowered safety and security thresholds. We explore how the current trajectory for AI risk determination and evaluation for foundation model use within national security is poised for a race to the bottom, to the detriment of the US's national security interests. Safety-critical and defense systems must comply with assurance frameworks that are aligned with established risk thresholds, and foundation models are no exception. As such, development of evaluation frameworks for AI-based military systems must preserve the safety and security of US critical and defense infrastructure, and remain in alignment with international humanitarian law.
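
To make the notion of a quantitative risk threshold concrete, consider how classical safety-critical assurance turns a threshold into an acceptance test. The Python sketch below is not from the paper: the 1e-9 failures-per-hour figure (an order of magnitude associated with catastrophic failure conditions in some aviation standards) and the Poisson failure model are illustrative assumptions only.

```python
import math

# Hypothetical societally accepted threshold: at most 1e-9 catastrophic
# failures per operating hour (an order of magnitude used for catastrophic
# failure conditions in some aviation standards; purely illustrative here).
THRESHOLD = 1e-9   # failures per operating hour
CONFIDENCE = 0.95  # statistical confidence demanded of the evidence

def zero_failure_upper_bound(test_hours: float, confidence: float = CONFIDENCE) -> float:
    """Upper confidence bound on the failure rate after failure-free testing,
    assuming failures arrive as a Poisson process: rate <= -ln(1 - c) / T."""
    return -math.log(1.0 - confidence) / test_hours

def meets_threshold(test_hours: float) -> bool:
    """Does failure-free operation over test_hours demonstrate compliance?"""
    return zero_failure_upper_bound(test_hours) <= THRESHOLD

# Failure-free hours needed to demonstrate the threshold at 95% confidence:
required_hours = -math.log(1.0 - CONFIDENCE) / THRESHOLD
print(f"failure-free hours required: {required_hours:.2e}")  # ~3.00e+09
print(meets_threshold(1e7))  # False: ten million test hours fall far short
```

The point, well known from the software-reliability literature, is the scale of the evidence burden: demonstrating a 1e-9 rate through failure-free operation alone would take on the order of three billion test hours. This is why assurance frameworks lean on design and process evidence rather than testing alone, and why quietly relaxing the threshold changes what counts as "safe."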
Problem

Research questions and friction points this paper is trying to address.

Lack of agreed global risk thresholds for AI systems
AI technologists skewing risk tolerances via arms race narratives
Need for military AI evaluation frameworks that preserve safety and security
Innovation

Methods, ideas, or system contributions that make the work stand out.

Advocating democratically deliberated AI risk tolerances
Exposing safety revisionism in AI adoption
Proposing evaluation frameworks for military AI anchored in established risk thresholds (sketched below)
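
The paper argues that such frameworks must be anchored in established risk thresholds and remain aligned with IHL; it does not prescribe an implementation. As a purely hypothetical illustration of what a threshold-anchored deployment gate could look like, the sketch below invents its metric names, threshold values, and evaluation sizes:

```python
# Hypothetical sketch of a threshold-anchored evaluation gate for a military
# foundation-model use case. Metric names, thresholds, and eval sizes are
# invented for illustration; the paper does not prescribe this design.
from dataclasses import dataclass

@dataclass(frozen=True)
class Requirement:
    metric: str       # what the evaluation measures
    max_rate: float   # assurance threshold on the failure rate

@dataclass(frozen=True)
class EvalResult:
    metric: str
    failures: int
    trials: int

    @property
    def upper_bound(self) -> float:
        """Crude conservative bound: point estimate plus one failure of
        margin (a stand-in for a proper binomial confidence bound)."""
        return (self.failures + 1) / self.trials

REQUIREMENTS = [
    Requirement("misidentification_rate", max_rate=1e-4),  # illustrative
    Requirement("unsafe_action_rate", max_rate=1e-6),      # illustrative
]

def deployment_gate(results: list[EvalResult]) -> bool:
    """Pass only if every requirement is covered by evidence whose
    conservative upper bound falls below its threshold; missing
    evidence fails the gate."""
    by_metric = {r.metric: r for r in results}
    for req in REQUIREMENTS:
        res = by_metric.get(req.metric)
        if res is None or res.upper_bound > req.max_rate:
            return False
    return True

print(deployment_gate([
    EvalResult("misidentification_rate", failures=0, trials=50_000),
    EvalResult("unsafe_action_rate", failures=0, trials=50_000),
]))  # False: 50,000 trials cannot demonstrate a 1e-6 rate
```

The failing example illustrates the race-to-the-bottom dynamic the paper warns of: an evaluation too small to demonstrate the required rate either blocks deployment or invites pressure to relax the threshold itself.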