Poison Once, Exploit Forever: Environment-Injected Memory Poisoning Attacks on Web Agents

📅 2026-04-02

📈 Citations: 0

✨ Influential: 0

career value

203K/year

🤖 AI Summary

This work addresses critical security vulnerabilities in large language model (LLM)-driven web agents stemming from their memory mechanisms, which introduce cross-session and cross-site risks that existing defenses fail to mitigate—particularly against stealthy attacks executable solely through environmental observation. The authors propose eTAMP, a novel attack method that achieves memory poisoning without direct access to the agent’s memory store, instead manipulating the user’s browsing environment to inject malicious content; a single contamination event can subsequently trigger poisoned behavior across multiple websites. The study uncovers a “frustration exploitation” phenomenon, wherein agents become more susceptible to manipulation when encountering task failures. Through trajectory modeling and adversarial testing on (Visual)WebArena, the attack demonstrates high efficacy against mainstream models, achieving a 32.5% success rate on GPT-5-mini, revealing that increased model capability does not inherently confer greater security.

Technology Category

Application Category

📝 Abstract

Memory makes LLM-based web agents personalized, powerful, yet exploitable. By storing past interactions to personalize future tasks, agents inadvertently create a persistent attack surface that spans websites and sessions. While existing security research on memory assumes attackers can directly inject into memory storage or exploit shared memory across users, we present a more realistic threat model: contamination through environmental observation alone. We introduce Environment-injected Trajectory-based Agent Memory Poisoning (eTAMP), the first attack to achieve cross-session, cross-site compromise without requiring direct memory access. A single contaminated observation (e.g., viewing a manipulated product page) silently poisons an agent's memory and activates during future tasks on different websites, bypassing permission-based defenses. Our experiments on (Visual)WebArena reveal two key findings. First, eTAMP achieves substantial attack success rates: up to 32.5% on GPT-5-mini, 23.4% on GPT-5.2, and 19.5% on GPT-OSS-120B. Second, we discover Frustration Exploitation: agents under environmental stress become dramatically more susceptible, with ASR increasing up to 8 times when agents struggle with dropped clicks or garbled text. Notably, more capable models are not more secure. GPT-5.2 shows substantial vulnerability despite superior task performance. With the rise of AI browsers like OpenClaw, ChatGPT Atlas, and Perplexity Comet, our findings underscore the urgent need for defenses against environment-injected memory poisoning.

Problem

Research questions and friction points this paper is trying to address.

memory poisoning

web agents

environment-injected attack

cross-session exploitation

LLM security

Innovation

Methods, ideas, or system contributions that make the work stand out.

memory poisoning

web agents

environment-injected attack