From Storage to Steering: Memory Control Flow Attacks on LLM Agents

📅 2026-03-16

📈 Citations: 0

✨ Influential: 0

career value

232K/year

🤖 AI Summary

This work addresses a critical security vulnerability in large language model (LLM) agents: their reliance on persistent memory for complex tasks can be exploited through malicious manipulation, leading to unintended tool invocations and cross-task behavioral deviations that violate user instructions. We introduce the concept of “Memory-Control Flow Attacks” (MCFA), which challenges the conventional assumption that control-flow threats are limited to transient interactions by exposing the persistent influence of memory on agent behavior. To systematically evaluate this risk, we develop MEMFLOW, an automated assessment framework integrating red-teaming, memory injection, and control-flow tracing techniques. Comprehensive experiments on mainstream LLM systems—including GPT-5 Mini, Claude Sonnet 4.5, and Gemini 2.5 Flash—within LangChain and LlamaIndex reveal that over 90% of tested cases remain vulnerable to MCFA even under stringent safety constraints, demonstrating the attack’s prevalence and high severity.

Technology Category

Application Category

📝 Abstract

Modern agentic systems allow Large Language Model (LLM) agents to tackle complex tasks through extensive tool usage, forming structured control flows of tool selection and execution. Existing security analyses often treat these control flows as ephemeral, one-off sessions, overlooking the persistent influence of memory. This paper identifies a new threat from Memory Control Flow Attacks (MCFA) that memory retrieval can dominate the control flow, forcing unintended tool usage even against explicit user instructions and inducing persistent behavioral deviations across tasks. To understand the impact of this vulnerability, we further design MEMFLOW, an automated evaluation framework that systematically identifies and quantifies MCFA across heterogeneous tasks and long interaction horizons. To evaluate MEMFLOW, we attack state-of-the-art LLMs, including GPT-5 mini, Claude Sonnet 4.5 and Gemini 2.5 Flash on real-world tools from two major LLM agent development frameworks, LangChain and LlamaIndex. The results show that in general over 90% trials are vulnerable to MCFA even under strict safety constraints, highlighting critical security risks that demand immediate attention.

Problem

Research questions and friction points this paper is trying to address.

Memory Control Flow Attacks

LLM Agents

Tool Usage

Security Vulnerability

Behavioral Deviation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Memory Control Flow Attacks

LLM Agents

MEMFLOW