Model-Based Diagnosis: Automating End-to-End Diagnosis of Network Failures

📅 2025-06-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Rapid root-cause diagnosis of enterprise network failures has long been hindered by fragmented diagnostic scope (prior work covers only the data plane or only the control plane) and by heavy reliance on manual analysis. Method: This paper proposes a model-based network diagnosis paradigm that systematically derives automated diagnosis procedures from a model of packet forwarding and routing, covering hardware, firmware, and software faults in both the data plane and the distributed control plane. Drawing on network verification techniques, the procedures infer root causes from end-to-end user-level symptom reports. The proof-of-concept implementation, NetDx, runs on an emulator of networks built from P4 switches with distributed routing software. Contribution/Results: In an automated fault injection campaign on the emulator, 100% of faults were diagnosed correctly. On 33 real-world faults from a large cloud provider that fall within NetDx's target domain, 30 were diagnosed in seconds rather than the hours needed for manual analysis.

📝 Abstract
Fast diagnosis and repair of enterprise network failures is critically important since disruptions cause major business impacts. Prior works focused on diagnosis primitives or procedures limited to a subset of the problem, such as only data plane or only control plane faults. This paper proposes a new paradigm, model-based network diagnosis, that provides a systematic way to derive automated procedures for identifying the root cause of network failures, based on reports of end-to-end user-level symptoms. The diagnosis procedures are systematically derived from a model of packet forwarding and routing, covering hardware, firmware, and software faults in both the data plane and distributed control plane. These automated procedures replace and dramatically accelerate diagnosis by an experienced human operator. Model-based diagnosis is inspired by, leverages, and is complementary to recent work on network verification. We have built NetDx, a proof-of-concept implementation of model-based network diagnosis. We deployed NetDx on a new emulator of networks consisting of P4 switches with distributed routing software. We validated the robustness and coverage of NetDx with an automated fault injection campaign, in which 100% of faults were diagnosed correctly. Furthermore, on a data set of 33 faults from a large cloud provider that are within the domain targeted by NetDx, 30 are efficiently diagnosed in seconds instead of hours.
Problem

Research questions and friction points this paper is trying to address.

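Identifying the root cause of enterprise network failures quickly from end-to-end, user-level symptom reports
Prior diagnosis primitives and procedures cover only a subset of faults, such as only the data plane or only the control plane
Diagnosis by an experienced human operator is slow, taking hours where automated procedures take seconds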
Innovation

Methods, ideas, or system contributions that make the work stand out.

Model-based network diagnosis for automated root cause identification
Systematic derivation from packet forwarding and routing models
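NetDx, a proof-of-concept implementation: 100% of injected faults diagnosed correctly on an emulator, and 30 of 33 real-world cloud-provider faults diagnosed in seconds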
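
To make the idea of deriving a diagnosis procedure from a forwarding and routing model more concrete, here is a minimal Python sketch. It is not NetDx's algorithm; the abstract does not describe that at this level of detail. The `SwitchState` type, its three fields, the `diagnose` function, and the fault labels are all hypothetical names chosen for illustration. The sketch assumes the model supplies an expected next hop per switch for one destination, and that the next hop computed by the routing software (RIB) and the one installed in the forwarding table (FIB) can both be observed.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class SwitchState:
    """Hypothetical per-switch view for a single destination."""
    expected_next_hop: Optional[str]  # what the routing model predicts
    rib_next_hop: Optional[str]       # what the routing software computed
    fib_next_hop: Optional[str]       # what the forwarding table contains


def diagnose(network: Dict[str, SwitchState], src: str, dst: str) -> Tuple[str, Optional[str]]:
    """Walk the modeled path from src to dst and report the first divergence.

    Returns (fault_kind, switch_name). A RIB that disagrees with the model is
    labeled a control-plane fault; a FIB that disagrees with the RIB is
    labeled a data-plane fault.
    """
    current = src
    visited = set()
    while current != dst:
        # A loop or an unknown switch means the model itself is unusable here.
        if current in visited or current not in network:
            return ("model-error", current)
        visited.add(current)
        state = network[current]
        if state.expected_next_hop is None:
            return ("model-error", current)
        if state.rib_next_hop != state.expected_next_hop:
            return ("control-plane", current)
        if state.fib_next_hop != state.rib_next_hop:
            return ("data-plane", current)
        current = state.expected_next_hop
    return ("no-fault", None)


def healthy_line() -> Dict[str, SwitchState]:
    """Toy topology s1 -> s2 -> s3 in which every layer agrees with the model."""
    return {
        "s1": SwitchState("s2", "s2", "s2"),
        "s2": SwitchState("s3", "s3", "s3"),
    }


# Inject a fault: s2's forwarding entry is missing although routing is correct.
net = healthy_line()
net["s2"].fib_next_hop = None
print(diagnose(net, "s1", "s3"))  # ('data-plane', 's2')
```

Checking the control plane before the data plane at each hop is a design choice of this sketch: a wrong route would otherwise show up as a forwarding mismatch against the model and be blamed on the wrong layer.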