A Behavioral Theory for Distributed Systems with Weak Recovery

📅 2024-06-18
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This paper addresses the problem of rigorously reasoning about distributed systems subject to crash failures with weak recovery, where a recovered node must be distinguishable from its past incarnations but no assumption is made about the state in which it resumes execution. The authors present a calculus inspired in part by Erlang and in part by the distributed π-calculus with node and link failures (DπF), with explicit support for node failure and identity-aware recovery. They define a contextual equivalence for the calculus and characterize it coinductively as weak bisimilarity over a labelled transition system semantics, proving that this characterization is fully abstract. The result gives a compositional proof technique for proving or disproving contextual equivalence between systems with failures and recovery.
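
To make the notion of weak recovery more concrete, here is a minimal Python sketch. It is not the paper's calculus: the `Node` class, the integer incarnation counter, and the choice to drop messages addressed to an earlier incarnation are all assumptions made for illustration. It only shows the two properties named in the summary: a recovered node may restart in an arbitrary state, and its identity differs from every past incarnation.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class Node:
    """A toy node whose identity is a (name, incarnation) pair (hypothetical model)."""
    name: str
    incarnation: int = 0
    alive: bool = True
    state: dict = field(default_factory=dict)

    def crash(self) -> None:
        # Crash failure: the node stops and its volatile state is lost.
        self.alive = False
        self.state = {}

    def recover(self, initial_state: Optional[dict] = None) -> None:
        # Weak recovery: nothing is assumed about the state after restart,
        # so any state may be supplied. Only the incarnation number is
        # guaranteed to differ from all earlier ones.
        self.incarnation += 1
        self.state = dict(initial_state or {})
        self.alive = True

    def identity(self) -> Tuple[str, int]:
        return (self.name, self.incarnation)


def deliver(node: Node, addressed_to: Tuple[str, int], payload: str) -> bool:
    """Deliver only to the current incarnation (an assumption of this sketch)."""
    if not node.alive or addressed_to != node.identity():
        return False  # the node is down, or the message targets a past incarnation
    node.state.setdefault("inbox", []).append(payload)
    return True
```

A sender holding the identity `("n1", 0)` can therefore tell, after a crash and restart, that it is no longer talking to the same incarnation, even though the node name is unchanged.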

📝 Abstract
Distributed systems can be subject to various kinds of partial failures; building fault-tolerance or failure-mitigation mechanisms for them therefore remains an important domain of research. In this paper, we present a calculus to formally model distributed systems subject to crash failures with recovery. The recovery model considered in the paper is weak, in the sense that it makes no assumption on the exact state in which a failed node resumes its execution; only its identity has to be distinguishable from past incarnations of itself. Our calculus is inspired in part by the Erlang programming language and in part by the distributed $\pi$-calculus with nodes and link failures (D$\pi$F) introduced by Francalanza and Hennessy. In order to reason about distributed systems with failures and recovery, we develop a behavioral theory for our calculus, in the form of a contextual equivalence and of a fully abstract coinductive characterization of this equivalence by means of a labelled transition system semantics and its associated weak bisimilarity. This result is valuable because it provides a compositional proof technique for proving or disproving contextual equivalence between systems.
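
As a rough illustration of what weak bisimilarity over a labelled transition system means, the following Python sketch checks it on a small finite LTS by a naive greatest-fixpoint computation. This is a generic textbook-style procedure, not the paper's semantics or proof technique: the dictionary encoding of the LTS, the `"tau"` label for internal steps, and all function names are assumptions of the sketch, and the paper's labels (which account for locations, failures, and recovery) are richer than plain strings.

```python
from itertools import product

TAU = "tau"  # assumed label for internal (unobservable) steps


def tau_closure(lts, state):
    """States reachable from `state` by zero or more tau steps."""
    seen, stack = {state}, [state]
    while stack:
        s = stack.pop()
        for label, target in lts.get(s, ()):
            if label == TAU and target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


def weak_successors(lts, state, label):
    """Targets of the weak transition tau* label tau* (just tau* if label is tau)."""
    before = tau_closure(lts, state)
    if label == TAU:
        return before
    after = set()
    for s in before:
        for step_label, target in lts.get(s, ()):
            if step_label == label:
                after |= tau_closure(lts, target)
    return after


def weak_bisimilar(lts, p, q):
    """Start from all pairs of states and discard pairs that violate the transfer property."""
    states = set(lts) | {t for edges in lts.values() for _, t in edges}
    relation = set(product(states, repeat=2))

    def simulated(a, b):
        # Every strong step of `a` must be matched by a weak step of `b`
        # that leads to a pair still in the relation.
        return all(
            any((a2, b2) in relation for b2 in weak_successors(lts, b, label))
            for label, a2 in lts.get(a, ())
        )

    changed = True
    while changed:
        changed = False
        for a, b in list(relation):
            if not (simulated(a, b) and simulated(b, a)):
                relation.discard((a, b))
                changed = True
    return (p, q) in relation


# Example LTS: `p` performs an internal step before `a`; `q` performs `a` directly;
# `r` performs `a` and can then still perform `b`.
example = {
    "p": [(TAU, "p1")],
    "p1": [("a", "p2")],
    "q": [("a", "q1")],
    "r": [("a", "r1")],
    "r1": [("b", "r2")],
}
```

With this encoding, `weak_bisimilar(example, "p", "q")` holds because the internal step of `p` is unobservable, whereas `weak_bisimilar(example, "q", "r")` does not, since `r` can perform `b` after `a` and `q` cannot.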

Problem

Research questions and friction points this paper is trying to address.

Distributed Systems
Fault Tolerance
Functional Equivalence

Innovation

Methods, ideas, or system contributions that make the work stand out.

Distributed System Recovery
System Function Equivalence
Erlang Programming

Authors

G. Fabbretti
Univ. Grenoble Alpes, INRIA, CNRS, Grenoble INP, LIG, 38000 Grenoble, France

Ivan Lanese
University of Bologna
Theory, Programming, Software Engineering

J. Stefani
Univ. Grenoble Alpes, INRIA, CNRS, Grenoble INP, LIG, 38000 Grenoble, France