π€ AI Summary
Existing asynchronous multiparty session type (MPST) theories assume perfectly reliable communication, failing to model crash-stop failures prevalent in distributed systems. Method: This work introduces crash-handling branches into the asynchronous MPST framework without extending the syntax of global types, enabling a continuous reliability spectrumβfrom fully reliable to fully unreliable communication. We formalize crash semantics, establish a sound and complete correspondence between global and local types under crashes, and prove type preservation and progress. Contribution/Results: The framework guarantees deadlock-freedom, protocol conformance, and liveness for well-typed processes under arbitrary crash patterns. By bridging the gap between theoretical models and real-world fault-prone systems, this work provides a foundational basis for formal verification of fault-tolerant session protocols.
π Abstract
Session types provide a typing discipline for message-passing systems. However, their theory often assumes an ideal world: one in which everything is reliable and without failures. Yet this is in stark contrast with distributed systems in the real world. To address this limitation, we introduce a new asynchronous multiparty session types (MPST) theory with crash-stop failures, where processes may crash arbitrarily and cease to interact after crashing. We augment asynchronous MPST and processes with crash handling branches, and integrate crash-stop failure semantics into types and processes. Our approach requires no user-level syntax extensions for global types, and features a formalisation of global semantics, which captures complex behaviours induced by crashed/crash handling processes. Our new theory covers the entire spectrum, ranging from the ideal world of total reliability to entirely unreliable scenarios where any process may crash, using optional reliability assumptions. Under these assumptions, we demonstrate the sound and complete correspondence between global and local type semantics, which guarantee deadlock-freedom, protocol conformance, and liveness of well-typed processes by construction, even in the presence of crashes.