🤖 AI Summary
In multi-agent systems, errors propagate across agents during multi-round deep search, which makes root-cause localization difficult. Method: This paper proposes an information-flow-based fault attribution paradigm. Its core innovations are: (1) constructing an Information Dependency Graph (IDG) that explicitly models how agents reference one another's outputs, replacing temporal sequences and enabling symptoms to be distinguished from root causes; (2) designing a graph-structure-aware synthetic data generation method that simulates failures at critical nodes; and (3) developing a graph-traversal-based root-cause tracing algorithm, integrated into the multi-agent framework for automated diagnosis. Results: On the Who&When benchmark, the approach improves attribution accuracy by up to 18.18%; in real-world deployment, it yields performance gains of 4.8%–14.2%.
📝 Abstract
Multi-agent systems powered by Large Language Models excel at complex tasks through coordinated collaboration, yet they face high failure rates in multi-turn deep search scenarios. Existing temporal attribution methods struggle to accurately diagnose root causes, particularly when errors propagate across multiple agents. Attempts to automate failure attribution by analyzing action sequences remain ineffective due to their inability to account for information dependencies that span agents. This paper identifies two core challenges: (i) distinguishing symptoms from root causes in multi-agent error propagation, and (ii) tracing information dependencies beyond temporal order. To address these issues, we introduce GraphTracer, a framework that redefines failure attribution through information flow analysis. GraphTracer constructs Information Dependency Graphs (IDGs) to explicitly capture how agents reference and build on prior outputs. It localizes root causes by tracing through these dependency structures instead of relying on temporal sequences. GraphTracer also uses graph-aware synthetic data generation to target critical nodes, creating realistic failure scenarios. Evaluations on the Who&When benchmark and integration into production systems demonstrate that GraphTracer-8B achieves up to 18.18% higher attribution accuracy compared to state-of-the-art models and enables 4.8% to 14.2% performance improvements in deployed multi-agent frameworks, establishing a robust solution for multi-agent system debugging.
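
To make the contrast between temporal attribution and dependency tracing concrete, the following is a minimal illustrative sketch, not GraphTracer's actual implementation. It assumes each agent step records which earlier steps it references and that some external checker has already flagged steps as erroneous. The names `Step`, `depends_on`, `flagged`, `trace_root_causes`, and `earliest_flagged` are hypothetical and do not come from the paper.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Step:
    """One agent output in a trajectory (hypothetical structure)."""
    step_id: int                 # position in temporal order
    agent: str
    depends_on: list[int] = field(default_factory=list)  # steps this output references
    flagged: bool = False        # assumed to come from an external error checker


def upstream(steps: dict[int, Step], start_ids: list[int]) -> set[int]:
    """Collect start_ids plus everything reachable along dependency edges."""
    seen: set[int] = set()
    stack = list(start_ids)
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(steps[current].depends_on)
    return seen


def trace_root_causes(steps: dict[int, Step], symptom_id: int) -> list[int]:
    """Flagged steps upstream of the symptom that depend on no other flagged step."""
    roots = []
    for step_id in upstream(steps, [symptom_id]):
        if not steps[step_id].flagged:
            continue
        ancestors = upstream(steps, steps[step_id].depends_on)
        if not any(steps[a].flagged for a in ancestors):
            roots.append(step_id)
    return sorted(roots)


def earliest_flagged(steps: dict[int, Step]) -> int:
    """Temporal baseline: blame the first flagged step in time order."""
    return min(s.step_id for s in steps.values() if s.flagged)


# Toy trajectory: step 2 is flagged but the final answer never references it.
steps = {
    1: Step(1, "planner"),
    2: Step(2, "web_surfer", depends_on=[1], flagged=True),
    3: Step(3, "retriever", depends_on=[1], flagged=True),
    4: Step(4, "summarizer", depends_on=[3], flagged=True),
    5: Step(5, "answerer", depends_on=[4], flagged=True),
}

graph_roots = trace_root_causes(steps, symptom_id=5)
temporal_guess = earliest_flagged(steps)
```

In this toy trajectory the temporal baseline blames step 2, which the failing answer never references, while the dependency trace returns step 3: steps 4 and 5 are treated as symptoms because they build on an already flagged upstream output.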