🤖 AI Summary
This paper identifies a stealthy memory-tampering threat against the shared memory components of heterogeneous multi-agent systems (MAS), specifically MARL experience replay buffers and RAG knowledge bases. Method: It formalizes such attacks as a bilevel optimization problem, unifying vulnerability analysis across MARL and RAG architectures, and introduces a sub-percent poisoning regime (≤1% buffer contamination; ≤0.1% knowledge base corruption). The proposed framework, XAMT, is a low-perturbation, high-stealth training-time attack that combines adversarial perturbation minimization, formal modeling of CTDE-based MARL, and modeling of knowledge base contamination. Contribution/Results: Evaluated on the SMAC and SafeRAG benchmarks, XAMT shows that extremely low poisoning rates can induce significant behavioral deviations while evading detection heuristics. The work provides a benchmark and analytical toolkit for assessing the robustness of safety-critical MAS.
📝 Abstract
The growing operational reliance on complex Multi-Agent Systems (MAS) in safety-critical domains calls for rigorous assessment of their adversarial robustness. Modern MAS are inherently heterogeneous, combining conventional Multi-Agent Reinforcement Learning (MARL) with emerging Large Language Model (LLM) agent architectures that use Retrieval-Augmented Generation (RAG). A critical shared vulnerability is their reliance on centralized memory components: the shared Experience Replay (ER) buffer in MARL and the external Knowledge Base (K) in RAG agents. This paper proposes XAMT (Bilevel Optimization for Covert Memory Tampering in Heterogeneous Multi-Agent Architectures), a novel framework that formalizes attack generation as a bilevel optimization problem. The upper level minimizes the perturbation magnitude (δ) to enforce covertness, while the lower level maximizes the divergence of system behavior toward an adversary-defined target. We provide rigorous mathematical instantiations for Centralized Training with Decentralized Execution (CTDE) MARL algorithms and for RAG-based LLM agents, demonstrating that bilevel optimization is uniquely suited to crafting stealthy, minimal-perturbation poisons that evade detection heuristics. Comprehensive experimental protocols use the SMAC and SafeRAG benchmarks to quantify effectiveness at sub-percent poison rates (≤1% in MARL, ≤0.1% in RAG). XAMT defines a new, unified class of training-time threats whose study is essential for developing intrinsically secure MAS, with implications for trust, formal verification, and defensive strategies that prioritize intrinsic safety over perimeter-based detection.
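To give a sense of scale for the sub-percent poison rates quoted above, the following minimal sketch converts the two rate caps (≤1% for MARL, ≤0.1% for RAG) into a count of entries an attacker could alter. It is illustrative arithmetic only, not part of XAMT: the function name `max_poisoned_entries` and the memory sizes (5,000 replay transitions, 100,000 knowledge base documents) are assumptions chosen for the example and do not come from the paper.

```python
from fractions import Fraction

# Poison-rate ceilings quoted in the abstract.
MARL_MAX_RATE = Fraction(1, 100)   # at most 1% of the replay buffer
RAG_MAX_RATE = Fraction(1, 1000)   # at most 0.1% of the knowledge base


def max_poisoned_entries(memory_size: int, max_rate: Fraction) -> int:
    """Largest whole number of entries that fits within the rate cap.

    Exact rational arithmetic avoids floating-point rounding at the boundary.
    """
    if memory_size < 0:
        raise ValueError("memory_size must be non-negative")
    return (memory_size * max_rate.numerator) // max_rate.denominator


# Hypothetical memory sizes, chosen only for illustration.
buffer_transitions = 5_000
kb_documents = 100_000

marl_budget = max_poisoned_entries(buffer_transitions, MARL_MAX_RATE)
rag_budget = max_poisoned_entries(kb_documents, RAG_MAX_RATE)
print(marl_budget, rag_budget)  # 50 100
```

Under these assumed sizes, the caps allow at most 50 tampered transitions and 100 tampered documents. The integer floor also means a store smaller than the reciprocal of the rate (for example, fewer than 100 transitions at 1%) leaves no room for even a single poisoned entry.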