🤖 AI Summary
Root cause analysis (RCA) for microservice systems has lacked a standard benchmark that combines large-scale datasets with a comprehensive evaluation environment. To address this gap, the paper introduces RCAEval, an open-source RCA benchmark for microservice systems. RCAEval provides three datasets comprising 735 failure cases collected from three microservice systems, covering various fault types observed in real-world failures. It also provides an evaluation framework with fifteen reproducible baselines spanning a wide range of RCA approaches, and it supports evaluating both coarse-grained and fine-grained RCA. The benchmark is intended as a ready-to-use resource that lets researchers and practitioners compare RCA methods and develop more robust solutions.
📝 Abstract
Root cause analysis (RCA) for microservice systems has gained significant attention in recent years. However, there is still no standard benchmark that includes large-scale datasets and provides a comprehensive evaluation environment. In this paper, we introduce RCAEval, an open-source benchmark that provides datasets and an evaluation environment for RCA in microservice systems. First, we introduce three comprehensive datasets comprising 735 failure cases collected from three microservice systems, covering various fault types observed in real-world failures. Second, we present a comprehensive evaluation framework that includes fifteen reproducible baselines covering a wide range of RCA approaches, with the ability to evaluate both coarse-grained and fine-grained RCA. RCAEval is designed to support both researchers and practitioners. We hope that this ready-to-use benchmark will enable them to conduct extensive analysis and pave the way for robust new solutions for RCA of microservice systems.