🤖 AI Summary
Root cause analysis (RCA) in telecom networks faces three key challenges: complex graph-structured dependencies among alarms, a lack of large-scale real-world benchmarks, and insufficient interpretability of causal reasoning. To address these challenges, this paper proposes the first autonomous, iterative multi-agent framework tailored for telecom alarm RCA. It integrates graph neural networks to model alarm propagation dynamics, reinforcement learning for sequential decision optimization, and collaborative multi-agent causal inference. We further construct and open-source the first large-scale, real-world RCA benchmark, comprising multi-vendor equipment, heterogeneous network topologies, and diverse fault patterns. Evaluated on production network data, our method improves root cause localization accuracy by 23.6% and reduces average response time by 41%, while demonstrating superior generalization and scalability across complex topologies. Our core innovations are pioneering the self-improving agent paradigm for telecom RCA and establishing a standardized evaluation framework for reproducible, rigorous assessment.
📝 Abstract
Root Cause Analysis (RCA) in telecommunication networks is a critical task, yet it presents a formidable challenge for Artificial Intelligence (AI) due to its complex, graph-based reasoning requirements and the scarcity of realistic benchmarks.