Resilience through Automated Adaptive Configuration for Distribution and Replication

📅 2025-06-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address insufficient resilience of complex systems running on heterogeneous hardware, this paper proposes a failure-adaptive method for optimizing software distribution and replication. We construct a state-space model of the system and introduce a novel equivalence relation between states that enables quotient reduction, shrinking the state space that must be explored. State-space exploration then automatically derives both resilient initial configurations and a reconfiguration policy that selects actions in response to failures and recoveries. Our key contributions are: (i) a new equivalence relation enabling quotient-based state-space reduction; and (ii) end-to-end automated synthesis of reconfiguration policies for failures and recoveries. A prototype implementation is applied successfully to a model of an autonomous driving system.

📝 Abstract
This paper presents a powerful automated framework for making complex systems resilient under failures, by optimized adaptive distribution and replication of interdependent software components across heterogeneous hardware components with widely varying capabilities. A configuration specifies how software is distributed and replicated: which software components to run on each computer, which software components to replicate, which replication protocols to use, etc. We present an algorithm that, given a system model and resilience requirements, (1) determines initial configurations of the system that are resilient, and (2) generates a reconfiguration policy that determines reconfiguration actions to execute in response to failures and recoveries. This model-finding algorithm is based on state-space exploration and incorporates powerful optimizations, including a quotient reduction based on a novel equivalence relation between states. We present experimental results from successfully applying a prototype implementation of our framework to a model of an autonomous driving system.
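The two outputs described above, a resilient initial configuration and a reconfiguration policy, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the component names, the host capacities, the validity rule, and the restriction to a single host failure without recovery or replication. It is a brute-force search, not the authors' algorithm.

```python
from itertools import product

# Hypothetical toy model (not from the paper): two software components
# and three hosts with different capacities.
COMPONENTS = ("planner", "perception")
CAPACITY = {"h1": 2, "h2": 1, "h3": 1}


def is_valid(config, alive):
    """config[i] is the host running COMPONENTS[i]. A configuration is valid
    when every component runs on a live host and no host is overloaded."""
    load = {}
    for host in config:
        if host not in alive:
            return False
        load[host] = load.get(host, 0) + 1
    return all(load[h] <= CAPACITY[h] for h in load)


def valid_configs(alive):
    """Enumerate all valid placements of the components on the live hosts."""
    hosts = sorted(alive)
    return [c for c in product(hosts, repeat=len(COMPONENTS)) if is_valid(c, alive)]


def moves(old, new):
    """Number of components that must migrate to go from old to new."""
    return sum(a != b for a, b in zip(old, new))


def synthesize():
    """Return (initial_config, policy), or None if no resilient start exists.
    policy maps (config, failed_host) to the configuration to switch to."""
    all_hosts = frozenset(CAPACITY)
    for init in valid_configs(all_hosts):
        policy = {}
        for failed in sorted(all_hosts):
            candidates = valid_configs(all_hosts - {failed})
            if not candidates:
                break  # this start cannot survive the failure; try another
            # Prefer the target that migrates the fewest components.
            policy[(init, failed)] = min(candidates, key=lambda c: moves(init, c))
        else:
            return init, policy
    return None


init, policy = synthesize()
for (config, failed), target in policy.items():
    print(f"{config} --fail {failed}--> {target}")
```

In this toy setting, both components start on the large host, and the policy moves them to the two small hosts only if the large host fails. A realistic model would also cover recoveries, replication protocols, and sequences of failures, which is where state-space size becomes the main obstacle.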
Problem

Research questions and friction points this paper is trying to address.

Automated framework for resilient complex systems under failures
Optimized adaptive distribution and replication of software components
Generates resilient initial configurations and reconfiguration policies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Automated adaptive distribution and replication framework
State-space exploration with quotient reduction
Resilient reconfiguration policy for failures
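
The idea of state-space exploration with quotient reduction can be sketched as follows. The equivalence used here (hosts of the same type are interchangeable, so only the number of live hosts per type matters) is an illustrative stand-in chosen for this example; it is not the equivalence relation defined in the paper, and the host types are hypothetical.

```python
from collections import deque

# Hypothetical model: each host has a type, and hosts of the same type
# are assumed interchangeable.
HOST_TYPES = ("gpu", "gpu", "cpu", "cpu", "cpu")


def successors(state):
    """Flip one host between up (True) and down (False):
    a single failure or a single recovery."""
    for i in range(len(state)):
        yield state[:i] + (not state[i],) + state[i + 1:]


def canonical(state):
    """Representative of a state's equivalence class:
    the number of live hosts of each type."""
    counts = {}
    for kind, up in zip(HOST_TYPES, state):
        counts[kind] = counts.get(kind, 0) + int(up)
    return tuple(sorted(counts.items()))


def explore(initial, key):
    """Breadth-first exploration that expands one state per distinct key."""
    seen = {key(initial)}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        for nxt in successors(state):
            k = key(nxt)
            if k not in seen:
                seen.add(k)
                queue.append(nxt)
    return seen


initial = (True,) * len(HOST_TYPES)
full = explore(initial, key=lambda s: s)   # every concrete state
quotient = explore(initial, key=canonical)  # one state per equivalence class
print(len(full), len(quotient))
```

With two interchangeable hosts of one type and three of another, exploration visits 12 equivalence classes instead of 32 concrete states. The reduction is sound here only because equivalent states have equivalent successors; any equivalence relation used this way needs that property.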
S. Stoller
Department of Computer Science, Stony Brook University, Stony Brook, NY, USA
Balaji Jayasankar
Department of Computer Science, Stony Brook University, Stony Brook, NY, USA
Yanhong A. Liu
Stony Brook University, State University of New York
Languages and Algorithms, Design and Optimization