Managing Escalation in Off-the-Shelf Large Language Models

📅 2025-08-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Off-the-shelf large language models (LLMs) exhibit a pronounced bias toward recommending military escalation in geopolitical wargaming scenarios, posing significant risks for U.S. national security applications. To address this, we propose two non-parametric interventions—contextual prompt engineering and role-constrained prompting—that modulate strategic output tendencies without altering model weights. Using a rigorous experimental wargaming framework, we systematically evaluate these interventions across diverse crisis scenarios. Results demonstrate that the interventions reduce escalation recommendations by an average of 62% and significantly improve alignment with established principles of deterrence stability and crisis management. This work provides the first empirical evidence that prompt-based design alone can render commercial LLMs behaviorally controllable and goal-aligned in high-stakes strategic domains. It establishes a scalable, low-barrier governance pathway for safely integrating LLMs into national security decision support systems.
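The two interventions named above can be illustrated with a minimal sketch. This is a hypothetical reconstruction, not the paper's actual prompts: the wording of `DEESCALATION_CONTEXT` and `ROLE_CONSTRAINT`, and the helper `build_intervened_prompt`, are illustrative assumptions about what contextual prompt engineering and role-constrained prompting might look like in a chat-message format.

```python
# Hypothetical sketch of the two non-parametric interventions:
# (1) contextual prompt engineering -- prepend de-escalatory framing context,
# (2) role-constrained prompting -- bind the model to a restrained advisor role.
# Prompt wording is illustrative, not taken from the paper.

DEESCALATION_CONTEXT = (
    "Background: established principles of deterrence stability and crisis "
    "management favor proportionate, reversible actions and open channels "
    "of communication over escalatory moves."
)

ROLE_CONSTRAINT = (
    "You are a cautious national-security advisor. Recommend only courses "
    "of action consistent with escalation management, and flag any option "
    "that risks unintended escalation."
)

def build_intervened_prompt(scenario: str) -> list:
    """Wrap a wargame scenario with both interventions as chat messages."""
    return [
        # Role-constrained prompting goes in the system message.
        {"role": "system", "content": ROLE_CONSTRAINT},
        # Contextual framing is prepended to the user's scenario prompt.
        {
            "role": "user",
            "content": (
                DEESCALATION_CONTEXT
                + "\n\nScenario: " + scenario
                + "\n\nRecommend a course of action."
            ),
        },
    ]

messages = build_intervened_prompt(
    "A naval standoff develops in contested waters."
)
```

Because neither intervention touches model weights, the same wrapper could, in principle, be applied to any chat-style commercial LLM endpoint.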

📝 Abstract
U.S. national security customers have begun to utilize large language models, including enterprise versions of "off-the-shelf" models (e.g., ChatGPT) familiar to the public. This uptake will likely accelerate. However, recent studies suggest that off-the-shelf large language models frequently suggest escalatory actions when prompted with geopolitical or strategic scenarios. We demonstrate two simple, non-technical interventions to control these tendencies. Introducing these interventions into the experimental wargame design of a recent study, we substantially reduce escalation throughout the game. Calls to restrict the use of large language models in national security applications are thus premature. The U.S. government is already, and will continue, employing large language models for scenario planning and suggesting courses of action. Rather than warning against such applications, this study acknowledges the imminent adoption of large language models, and provides actionable measures to align them with national security goals, including escalation management.
Problem

Research questions and friction points this paper is trying to address.

Control escalatory tendencies in off-the-shelf LLMs
Align LLM outputs with national security goals
Mitigate escalation risks in geopolitical scenario planning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Non-technical, prompt-based interventions to control escalation
Experimental wargame design demonstrating reduced escalation
Alignment of large language models with national security goals