Evo-Attacker: Memory-Augmented Reinforcement Learning for Long-Horizon Tool Attacks on LLM-MAS

📅 2026-05-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Large language model multi-agent systems (LLM-MAS) are exposed to security risks because they rely heavily on the outputs of external tools. Existing attack methods are limited by domain specificity or static templates, which prevents effective long-horizon, dynamic tool manipulation. This work proposes Evo-Attacker, presented as the first dynamic tool attack framework capable of sustained evolution, and formulates the attack process as a memory-augmented reinforcement learning problem. The framework uses a dynamic attack memory bank and a deliberative reasoning mechanism to retrieve adversarial patterns and plan interventions at critical decision points. To address the long-horizon credit assignment challenge, it introduces the Attack-Flow GRPO algorithm. Experiments are reported to show that the approach consistently outperforms existing baselines across diverse scenarios and exhibits strong generalization and evolutionary capabilities, underscoring the urgent need for robust defense mechanisms in LLM-MAS toolchains.
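The retrieval step of a memory bank like the one described above can be illustrated with a generic nearest-neighbor lookup. This is a minimal sketch under stated assumptions, not the paper's implementation: the class name `PatternMemory`, the stored fields, and the use of cosine similarity over fixed-length embeddings are all hypothetical choices made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class PatternMemory:
    """Hypothetical memory bank: stores (embedding, pattern, outcome) entries."""

    def __init__(self):
        self.entries = []

    def add(self, embedding, pattern, outcome):
        # `outcome` is assumed to be a scalar score recorded after an episode.
        self.entries.append((list(embedding), pattern, outcome))

    def retrieve(self, query, k=2):
        # Rank stored patterns by similarity to the query, most similar first.
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine(query, entry[0]),
            reverse=True,
        )
        return [pattern for _, pattern, _ in ranked[:k]]


memory = PatternMemory()
memory.add([1.0, 0.0], "pattern-a", 1.0)
memory.add([0.0, 1.0], "pattern-b", 0.0)
memory.add([1.0, 1.0], "pattern-c", 1.0)
top = memory.retrieve([1.0, 0.1], k=2)
```

A real system would likely use learned embeddings and an approximate nearest-neighbor index; the linear scan here only keeps the sketch self-contained.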

📝 Abstract

While Large Language Model-based Multi-Agent Systems (LLM-MAS) demonstrate remarkable capabilities in solving complex tasks by orchestrating specialized agents and external tools, their implicit trust in tool outputs creates a critical attack surface. Existing tool attacks are limited by domain specificity or static templates. To address these limitations, we propose Evo-Attacker, which formulates the tool attack as a self-evolving, memory-augmented reinforcement learning process. Evo-Attacker constructs a dynamic attack memory and employs deliberative reasoning to retrieve adversarial patterns and plan modification interventions at critical moments. Furthermore, we introduce Attack-Flow GRPO, which optimizes intermediate reasoning steps using terminal outcomes and thereby addresses the long-horizon credit assignment challenge. Comprehensive experiments demonstrate that Evo-Attacker consistently outperforms baselines, highlighting its generalization and evolutionary capabilities and the urgent need for defensive tool safeguards.
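The idea of optimizing intermediate steps from terminal outcomes can be sketched with the standard group-relative advantage used in GRPO: sample a group of rollouts, normalize each rollout's terminal reward against the group mean and standard deviation, and credit every step of the rollout with that value. This is a sketch of generic GRPO-style credit assignment only; the abstract does not specify how Attack-Flow GRPO differs, and the function names and the `eps` parameter below are assumptions for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(terminal_rewards, eps=1e-8):
    """Normalize each rollout's terminal reward within its sampled group."""
    mu = mean(terminal_rewards)
    sigma = pstdev(terminal_rewards)
    # eps avoids division by zero when all rollouts share the same reward.
    return [(r - mu) / (sigma + eps) for r in terminal_rewards]


def broadcast_to_steps(advantages, steps_per_rollout):
    """Assign each rollout's advantage to all of its intermediate steps."""
    return [[adv] * n for adv, n in zip(advantages, steps_per_rollout)]


# Four rollouts of one task; only the final outcome is scored (1.0 or 0.0).
rewards = [1.0, 0.0, 1.0, 0.0]
steps = [3, 5, 2, 4]
advantages = group_relative_advantages(rewards)
per_step = broadcast_to_steps(advantages, steps)
```

Because every step inherits its rollout's advantage, no per-step reward model is needed; the trade-off is that steps within a rollout are not distinguished from one another by this signal alone.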

Problem

Research questions and friction points this paper is trying to address.

tool attacks

LLM-MAS

long-horizon

adversarial patterns

attack surface

Innovation

Methods, ideas, or system contributions that make the work stand out.

Memory-Augmented Reinforcement Learning

Long-Horizon Tool Attacks

Attack-Flow GRPO