Graph Neural AI with Temporal Dynamics for Comprehensive Anomaly Detection in Microservices

📅 2025-11-05
🤖 AI Summary
To address anomaly detection and root-cause localization in microservice architectures, this paper proposes a joint structural-temporal modeling framework. It represents service dependencies as a directed graph and couples a dynamic, topology-aware, multi-layer graph convolutional network with gated temporal units to compute anomaly scores at two granularities: node level and path level. A multi-dimensional feature aggregation mechanism and a structural-temporal co-representation module enable interpretable reconstruction and identification of anomaly propagation paths. Experiments on multiple real-world microservice datasets show that the method outperforms state-of-the-art baselines on key metrics, including AUC and F1-score, and remains robust and generalizes well, particularly under dynamic topologies and high-noise conditions.

📝 Abstract
This study addresses anomaly detection and root cause tracing in microservice architectures and proposes a unified framework that combines graph neural networks with temporal modeling. The microservice call chain is abstracted as a directed graph in which multidimensional node and edge features form a service topology representation. Graph convolution aggregates features across neighboring nodes to model dependencies and capture complex structural relationships among services. On this basis, gated recurrent units model the temporal evolution of call chains, and multi-layer stacking and concatenation produce joint structural and temporal representations that improve the identification of anomaly patterns. Anomaly scoring functions are defined at both the node and path levels, unifying local anomaly detection with global call chain tracing; this enables the identification of abnormal service nodes and the reconstruction of potential anomaly propagation paths. Sensitivity experiments covering hyperparameters, environmental disturbances, and data distribution evaluate the framework. The results show that it outperforms baseline methods on key metrics such as AUC, ACC, Recall, and F1-Score, and maintains high accuracy and stability under dynamic topologies and complex environments. This research provides a new technical path for anomaly detection in microservices and lays a methodological foundation for intelligent operations in distributed systems.
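The abstract describes the pipeline only in words, so the following is a minimal NumPy sketch of how stacked graph convolution, a gated recurrent unit, and concatenation could fit together. It is not the authors' implementation: the row-normalised adjacency with self-loops, the ReLU activation, the standard GRU gate equations, the random weights, and every name (`normalize_adjacency`, `gcn_layer`, `gru_step`, `encode`) are assumptions made for illustration, and edge features are omitted.

```python
import numpy as np

def normalize_adjacency(adj):
    """Add self-loops and row-normalise, so each service averages over
    itself and the services it calls (adj[i, j] = 1 means i calls j)."""
    a_hat = adj + np.eye(adj.shape[0])
    return a_hat / a_hat.sum(axis=1, keepdims=True)

def gcn_layer(a_norm, h, w):
    """One graph-convolution layer (assumed form): ReLU(A_norm @ H @ W)."""
    return np.maximum(a_norm @ h @ w, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, p):
    """One standard GRU update applied to every node independently."""
    z = sigmoid(x @ p["Wz"] + h_prev @ p["Uz"])           # update gate
    r = sigmoid(x @ p["Wr"] + h_prev @ p["Ur"])           # reset gate
    h_new = np.tanh(x @ p["Wh"] + (r * h_prev) @ p["Uh"])  # candidate state
    return (1.0 - z) * h_prev + z * h_new

def encode(adjs, feats, gcn_weights, gru_params, hidden):
    """Run stacked GCN layers on each snapshot, feed the result to the GRU,
    and concatenate the last structural and temporal representations.
    Expects at least one snapshot."""
    h_t = np.zeros((feats[0].shape[0], hidden))
    for adj, x in zip(adjs, feats):
        a_norm = normalize_adjacency(adj)  # recomputed: topology may change
        s = x
        for w in gcn_weights:
            s = gcn_layer(a_norm, s, w)
        h_t = gru_step(s, h_t, gru_params)
    return np.concatenate([s, h_t], axis=1)

# Hypothetical toy input: 4 services, 3 time steps, 5 metrics per service.
rng = np.random.default_rng(0)
n, steps, f_in, f_gcn, hidden = 4, 3, 5, 8, 6
adjs = [(rng.random((n, n)) < 0.4).astype(float) for _ in range(steps)]
feats = [rng.normal(size=(n, f_in)) for _ in range(steps)]
gcn_weights = [rng.normal(scale=0.1, size=(f_in, f_gcn)),
               rng.normal(scale=0.1, size=(f_gcn, f_gcn))]
gru_params = {k: rng.normal(scale=0.1, size=(f_gcn if k[0] == "W" else hidden, hidden))
              for k in ("Wz", "Wr", "Wh", "Uz", "Ur", "Uh")}
emb = encode(adjs, feats, gcn_weights, gru_params, hidden)
print(emb.shape)
```

Recomputing the normalised adjacency for every snapshot is one simple way to accommodate a topology that changes over time; a trained model would learn the weights rather than sample them.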
Problem

Research questions and friction points this paper is trying to address.

Detecting anomalies in microservice architectures using temporal graph networks
Identifying root causes by modeling service dependencies and call chains
Improving accuracy and stability in dynamic distributed system environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph neural networks model microservice structural dependencies
Gated recurrent units capture temporal call chain evolution
Node and path level scoring enables unified anomaly tracing
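The summary does not give the scoring functions themselves, so the sketch below only illustrates one plausible shape for dual-granularity scoring. It assumes, purely for illustration, that a node's score is the reconstruction error of its feature vector and that a path's score is the mean of the node scores along a call path; the names `node_scores`, `path_score`, and `rank_paths` are hypothetical and do not come from the paper.

```python
import numpy as np

def node_scores(observed, reconstructed):
    """Assumed node-level score: L2 reconstruction error per service."""
    return np.linalg.norm(observed - reconstructed, axis=1)

def path_score(scores, path):
    """Assumed path-level score: mean node score along a call path,
    where the path is a sequence of node indices."""
    return float(np.mean([scores[i] for i in path]))

def rank_paths(scores, paths):
    """Order candidate call paths from most to least anomalous."""
    return sorted(paths, key=lambda p: path_score(scores, p), reverse=True)

# Hypothetical toy data: three services, two metrics each.
observed = np.array([[1.0, 0.0], [0.0, 1.0], [3.0, 4.0]])
reconstructed = np.zeros_like(observed)

scores = node_scores(observed, reconstructed)   # errors 1, 1, and 5
ranked = rank_paths(scores, [[0, 1], [0, 2]])   # [0, 2] ranks first
print(scores, ranked)
```

Averaging keeps long and short paths comparable; a maximum or a learned aggregation would be equally reasonable choices under these assumptions.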
Qingyuan Zhang
Boston University, Boston, USA
Ning Lyu
Carnegie Mellon University, Pittsburgh, USA
Le Liu
Northwestern Polytechnical University
Visualization, Computer Graphics, Computer Vision, AI
Yuxiao Wang
Hofstra University, Hempstead, USA
Ziyu Cheng
Cancan Hua
University of Southern California, Los Angeles, USA