From Spark to Fire: Modeling and Mitigating Error Cascades in LLM-Based Multi-Agent Collaboration

📅 2026-03-04

🤖 AI Summary
This work addresses the vulnerability of LLM-based multi-agent collaboration to error propagation: minor mistakes can cascade through message dependencies and harden into a system-wide false consensus that is difficult to trace. The study introduces an error propagation dynamics model tailored to this setting, abstracting collaborative interactions as a directed dependency graph, and identifies three vulnerability classes: cascade amplification, topological sensitivity, and consensus inertia. Building on this model, the authors propose a lightweight genealogy-graph-based governance layer, implemented as a message-layer plugin, that intervenes without modifying the underlying collaboration architecture. Experiments on six mainstream frameworks show that injecting a single atomic error can trigger widespread failure, while the proposed layer raises the defense success rate from a baseline of 0.32 to over 0.89.
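To make the directed-dependency-graph abstraction concrete, here is a minimal Python sketch of how a single seeded error might spread. It is an illustration under stated assumptions, not the paper's model: the update rule (each agent independently adopts an error from each upstream source with probability `beta`), the parameter values, and the names `step`, `simulate`, `chain`, and `hub` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical representation: agent -> list of agents whose messages it reads.
Graph = Dict[str, List[str]]


def step(graph: Graph, p: Dict[str, float], beta: float) -> Dict[str, float]:
    """One round of an assumed independent-cascade-style update.

    p[agent] is the probability that the agent currently holds the error.
    An agent stays clean only if it was clean and no upstream source
    passed the error on (each source succeeds with probability beta).
    """
    nxt = {}
    for agent, sources in graph.items():
        stay_clean = 1.0 - p[agent]
        for src in sources:
            stay_clean *= 1.0 - beta * p[src]
        nxt[agent] = 1.0 - stay_clean
    return nxt


def simulate(graph: Graph, seed: str, beta: float = 0.5, rounds: int = 3) -> Dict[str, float]:
    """Inject one error at `seed` and run the update for `rounds` rounds."""
    p = {agent: 0.0 for agent in graph}
    p[seed] = 1.0
    for _ in range(rounds):
        p = step(graph, p, beta)
    return p


# Two toy topologies with the same agents and the same seed "a".
chain = {"a": [], "b": ["a"], "c": ["b"], "d": ["c"]}   # a -> b -> c -> d
hub = {"a": [], "b": ["a"], "c": ["a"], "d": ["a"]}     # a feeds everyone

if __name__ == "__main__":
    print("chain:", simulate(chain, "a"))
    print("hub:  ", simulate(hub, "a"))
```

Under these assumed settings, agent `d` is far more exposed when it reads the seeded agent directly than when the error must travel down a chain, which is one simple way to picture topological sensitivity.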
📝 Abstract
Large Language Model-based Multi-Agent Systems (LLM-MAS) are increasingly applied to complex collaborative scenarios. However, their collaborative mechanisms may cause minor inaccuracies to gradually solidify into system-level false consensus through iteration. Such risks are difficult to trace since errors can propagate and amplify through message dependencies. Existing protections often rely on single-agent validation or require modifications to the collaboration architecture, which can weaken effective information flow and may not align with natural collaboration processes in real tasks. To address this, we propose a propagation dynamics model tailored for LLM-MAS that abstracts collaboration as a directed dependency graph and provides an early-stage risk criterion to characterize amplification risk. Through experiments on six mainstream frameworks, we identify three vulnerability classes: cascade amplification, topological sensitivity, and consensus inertia. We further instantiate an attack where injecting just a single atomic error seed leads to widespread failure. In response, we introduce a genealogy-graph-based governance layer, implemented as a message-layer plugin, that suppresses both endogenous and exogenous error amplification without altering the collaboration architecture. Experiments show that this approach raises the defense success rate from a baseline of 0.32 to over 0.89 and significantly mitigates the cascading spread of minor errors.
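The abstract does not give implementation details for the governance layer, so the following Python sketch only illustrates the general idea of tracking message ancestry at the message layer. Everything here is an assumption made for illustration: the `Message` and `LineageTracker` names, the `parents` field, and the quarantine policy (block every descendant of a flagged message) are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Message:
    """Hypothetical message envelope carrying its own ancestry."""
    msg_id: str
    sender: str
    content: str
    parents: List[str] = field(default_factory=list)  # ids of messages this one relied on


class LineageTracker:
    """Assumed message-layer plugin: records ancestry, quarantines tainted lineages."""

    def __init__(self) -> None:
        self.children: Dict[str, List[str]] = {}
        self.quarantined: Set[str] = set()

    def record(self, msg: Message) -> None:
        """Register a message and link it to the messages it depends on."""
        self.children.setdefault(msg.msg_id, [])
        for parent in msg.parents:
            self.children.setdefault(parent, []).append(msg.msg_id)
        # A message derived from quarantined input is quarantined on arrival.
        if any(parent in self.quarantined for parent in msg.parents):
            self.quarantined.add(msg.msg_id)

    def flag(self, msg_id: str) -> None:
        """Mark a message as erroneous and quarantine all of its descendants."""
        stack = [msg_id]
        while stack:
            current = stack.pop()
            if current in self.quarantined:
                continue
            self.quarantined.add(current)
            stack.extend(self.children.get(current, []))

    def deliverable(self, msg: Message) -> bool:
        """Agents only receive messages whose lineage is not quarantined."""
        return msg.msg_id not in self.quarantined


# Usage: m2 and m3 descend from m1; m4 is independent.
tracker = LineageTracker()
m1 = Message("m1", "planner", "claim with a subtle error")
m2 = Message("m2", "coder", "builds on m1", parents=["m1"])
m3 = Message("m3", "reviewer", "agrees with m2", parents=["m2"])
m4 = Message("m4", "tester", "independent observation")
for m in (m1, m2, m3, m4):
    tracker.record(m)
tracker.flag("m1")  # how an error is detected is out of scope for this sketch
```

Because the tracker only wraps message delivery, the agents and their collaboration topology stay unchanged in this sketch, which mirrors the plug-in framing described in the abstract.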
Problem

Research questions and friction points this paper is trying to address.

error cascades
LLM-based multi-agent systems
false consensus
error propagation
collaborative AI
Innovation

Methods, ideas, or system contributions that make the work stand out.

error cascade
LLM-based multi-agent systems
propagation dynamics
genealogy graph
consensus inertia