Resilient and Reliable Cloud Network Control for Mission-Critical Latency-Sensitive Service Chains

📅 2025-11-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
Ensuring both reliability and resilience for mission-critical, low-latency service chains remains challenging because sustained dependable operation and rapid post-failure recovery impose conflicting requirements. Method: This paper proposes a unified stochastic network control framework that jointly models reliability under normal operation and recovery after failures. It formulates reliable and resilient control as a stochastic control problem subject to both long-term (reliability) and short-term (recovery) timely-throughput constraints, and designs the Multi-Commodity Resilient and Reliable Cloud Network Control (MC-ResRCNC) algorithm for dynamic resource orchestration across time scales. Contribution/Results: Experiments demonstrate that MC-ResRCNC significantly improves timely throughput under normal operation and reduces end-to-end recovery time by 58% under node or link failures. It outperforms state-of-the-art approaches in both reliability and network resilience. This work establishes a verifiable, deployable resilient control paradigm for latency-sensitive service chains.

📝 Abstract
The proliferation of mission-critical latency-sensitive services has intensified the demand for next-generation cloud-integrated networks to guarantee both reliable and resilient service delivery. Reliability imposes timely-throughput requirements, i.e., the percentage of packets that must be delivered within a prescribed per-packet deadline, whereas resilience relates to the network's ability to swiftly recover timely-throughput performance following an outage event, such as a node or link failure. Although recent studies have increasingly focused on designing reliable network control policies, a comprehensive solution that combines reliable and resilient network control has yet to be fully explored. This paper formulates the multi-commodity least-cost resilient and reliable network control (MC-LC-ResRNC) problem as a stochastic control problem with long- and short-term timely-throughput constraints. We then present a solution through the Multi-Commodity Resilient and Reliable Cloud Network Control (MC-ResRCNC) algorithm and show through numerical experiments that it jointly ensures reliability under normal conditions and resilience upon network failure.
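
To make the timely-throughput definition concrete, the following is a minimal sketch that computes the fraction of packets delivered within a per-packet deadline. It is an illustration of the metric only, not the paper's algorithm; the function name `timely_throughput`, the convention that `None` marks an undelivered packet, and the sample delays are all assumptions made for this example.

```python
from typing import Optional, Sequence


def timely_throughput(delays: Sequence[Optional[float]], deadline: float) -> float:
    """Return the fraction of packets delivered within `deadline`.

    Assumption for this sketch: a delay of None marks a packet that was
    dropped or never delivered, so it counts as missing its deadline.
    """
    if not delays:
        raise ValueError("need at least one packet")
    on_time = sum(1 for d in delays if d is not None and d <= deadline)
    return on_time / len(delays)


# Hypothetical end-to-end delays in milliseconds for eight packets.
delays_ms = [4.0, 9.5, 12.0, None, 7.2, 10.0, 3.1, 15.4]

# Five of the eight packets arrive within the 10 ms deadline.
print(timely_throughput(delays_ms, deadline=10.0))  # 0.625
```

A reliability requirement in this sense would then read as a lower bound on this fraction, for example requiring it to stay at or above some target level.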

Problem

Research questions and friction points this paper is trying to address.

Ensuring reliable and resilient cloud network control for latency-sensitive services
Addressing timely-throughput requirements and swift recovery from network failures
Developing a comprehensive solution combining reliability and resilience in network control

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-commodity algorithm for resilient and reliable cloud network control
Stochastic control with long and short-term timely throughput constraints
Ensures reliability under normal conditions and resilience upon network failure
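
The distinction between long-term and short-term timely-throughput constraints can be illustrated with a small measurement sketch. It is not the MC-ResRCNC algorithm and does not reproduce the paper's formulation; it only shows, under assumed definitions, how one might measure long-term timely throughput over a whole trace and a recovery time after a failure using a trailing window. The function names, the window-based recovery criterion, and the sample trace are hypothetical.

```python
from typing import Optional, Sequence


def long_term_timely_throughput(on_time: Sequence[bool]) -> float:
    """Fraction of on-time packets over the whole trace."""
    if not on_time:
        raise ValueError("need at least one packet")
    return sum(on_time) / len(on_time)


def recovery_time(
    on_time: Sequence[bool], failure_index: int, window: int, target: float
) -> Optional[int]:
    """Packets elapsed since the failure until short-term performance recovers.

    Assumed criterion for this sketch: recovery happens at the first packet
    index i such that the trailing window of `window` packets ending at i lies
    entirely at or after `failure_index` and has timely throughput >= `target`.
    Returns i - failure_index + 1, or None if the trace never recovers.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    for i in range(failure_index + window - 1, len(on_time)):
        recent = on_time[i - window + 1 : i + 1]
        if sum(recent) / window >= target:
            return i - failure_index + 1
    return None


# Hypothetical trace: 6 on-time packets, a failure causing 4 late packets,
# then 6 on-time packets once traffic has been rerouted.
trace = [True] * 6 + [False] * 4 + [True] * 6

print(long_term_timely_throughput(trace))  # 0.75
print(recovery_time(trace, failure_index=6, window=4, target=0.75))  # 7
```

In this toy trace the long-term figure hides the outage, while the windowed measure exposes how many packets pass before performance is restored, which is the kind of short-term behavior a resilience constraint targets.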