🤖 AI Summary
Large language model (LLM)-driven multi-agent systems (MASs) are vulnerable to false-information injection attacks, which cause task failure and decision distortion. To address this, we propose ARGUS, a training-free, two-stage defense framework. Its core innovation is that it is the first to introduce a target-aware reasoning mechanism for systematically modeling misinformation propagation dynamics within MASs. ARGUS integrates logic-guided goal-alignment analysis, causal information-flow tracing, and context-sensitive misinformation localization and regeneration to enable precise detection and correction. Evaluated on MisinfoTask, a novel benchmark of complex, realistic tasks that we curate, ARGUS reduces misinformation toxicity by 28.17% on average and improves task success rate by 10.33%. All code, models, and the MisinfoTask dataset are publicly released.
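The two-stage detect-then-correct flow described above can be sketched in a few lines. This is a hypothetical illustration in the spirit of ARGUS, not the authors' implementation: the `Message` type, the `aligns_with_goal` check (which in a real system would invoke an LLM judge for goal-alignment reasoning), and the `regenerate` rewrite step are all illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Message:
    sender: str
    content: str

def aligns_with_goal(msg: Message, task_goal: str) -> bool:
    # Stage 1 (assumed): goal-aware detection. A real system would reason
    # over the message with an LLM judge; this toy check flags a marker token.
    return "INJECTED" not in msg.content

def regenerate(msg: Message, task_goal: str) -> Message:
    # Stage 2 (assumed): localize the misleading span and rewrite it so the
    # corrected message can continue through the inter-agent information flow.
    cleaned = msg.content.replace("INJECTED", "[removed]")
    return Message(msg.sender, cleaned)

def defend(flow: list[Message], task_goal: str) -> list[Message]:
    # Apply detection, then correction, to each message in the flow.
    return [m if aligns_with_goal(m, task_goal) else regenerate(m, task_goal)
            for m in flow]

flow = [
    Message("agent_a", "Step 1: gather sources."),
    Message("agent_b", "INJECTED false claim about step 2."),
]
result = defend(flow, "write a factual report")
print(result[1].content)  # flagged message is rewritten in place
```

The key design point mirrored here is that flagged messages are regenerated rather than dropped, so downstream agents still receive a usable message.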
📝 Abstract
Large Language Model-based Multi-Agent Systems (MASs) have demonstrated strong advantages in addressing complex real-world tasks. However, because they introduce additional attack surfaces, MASs are particularly vulnerable to misinformation injection. To facilitate a deeper understanding of misinformation propagation dynamics within these systems, we introduce MisinfoTask, a novel dataset of complex, realistic tasks designed to evaluate MAS robustness against such threats. Building on this, we propose ARGUS, a two-stage, training-free defense framework that leverages goal-aware reasoning to precisely rectify misinformation within information flows. Our experiments demonstrate that, in challenging misinformation scenarios, ARGUS is effective across various injection attacks, reducing misinformation toxicity by approximately 28.17% on average and improving task success rates under attack by approximately 10.33%. Our code and dataset are available at: https://github.com/zhrli324/ARGUS.