The Task Completion Problem and its Application to Crash-Resilient Computation

📅 2026-05-18
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses the problem of efficiently executing M abstract tasks in a congested-clique network of n nodes, where up to αn nodes may suffer crash failures for some constant α < 1. The paper proposes a deterministic fault-tolerant algorithm whose key innovation is a novel combinatorial structure called a “load balancing covering family.” This structure induces, for each task, a subset of nodes responsible for attempting it, and guarantees that, regardless of which nodes crash and which tasks remain incomplete, no surviving node is overloaded and every task retains enough assigned nodes. The algorithm completes all M tasks in O(⌈M/n⌉ log n) rounds, which is optimal up to log log n terms. As an application, it yields a deterministic fault-tolerant simulation of any T-round congested-clique algorithm in O(T² log n + T log² n) rounds, improving on the T² · 2^O(√(log n) · log log n) rounds required by Censor-Hillel et al. (DISC 2025).
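To get a feel for the stated bounds, the following sketch evaluates them for concrete values. It is an illustration only: asymptotic constants are dropped, logarithms are taken base 2, and the constant `c` in the exponent of the prior bound is an assumed placeholder, since the source gives only an O(·) there. All function names are hypothetical.

```python
import math


def task_completion_rounds(M, n):
    """ceil(M/n) * log2(n): the task-completion bound with constants dropped."""
    return math.ceil(M / n) * math.log2(n)


def simulation_rounds(T, n):
    """T^2 log n + T log^2 n: the new simulation bound with constants dropped."""
    log_n = math.log2(n)
    return T**2 * log_n + T * log_n**2


def prior_simulation_rounds(T, n, c=1.0):
    """T^2 * 2^(c * sqrt(log n) * log log n): the prior bound.

    c stands in for the unknown constant hidden by the O(.) in the exponent;
    c = 1.0 is an assumption made purely for illustration.
    """
    log_n = math.log2(n)
    return T**2 * 2 ** (c * math.sqrt(log_n) * math.log2(log_n))


if __name__ == "__main__":
    n, T = 2**16, 16
    print(task_completion_rounds(1000, 256))
    print(simulation_rounds(T, n))
    print(prior_simulation_rounds(T, n))
```

For n = 2^16 and T = 16 the new expression evaluates to 8192, while the prior expression with c = 1 evaluates to 16777216. The real gap depends on the hidden constants, so these numbers indicate growth rates rather than measured round counts.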
📝 Abstract
We study the Task Completion problem, in which $M$ abstract tasks must be completed by a network of $n$ crash-prone nodes, where up to $\alpha n$ nodes may crash for some constant $\alpha<1$. Our main result is a deterministic congested-clique algorithm that completes all $M$ tasks in $O(\lceil M/n\rceil \log n)$ rounds. This round complexity is optimal up to $\log\log n$ terms. The key technical ingredient underlying our algorithm is a novel combinatorial structure, which we call a load balancing covering family. In essence, this covering family induces, for each task, a subset of nodes responsible for attempting to complete it. The properties of the load balancing covering family guarantee that, regardless of which tasks remain incomplete and which nodes crash, (i) no node is overloaded with incomplete tasks, and (ii) no task is left with too few potential assigned nodes. This yields a balanced per-node workload and prevents non-crashed nodes from being concentrated on a small subset of tasks, thereby ensuring sufficient progress in completing the remaining tasks. As an application of our task completion method, we give a deterministic algorithm for simulating any $T$-round congested-clique algorithm in the presence of up to $\alpha n$ crash faults in $O(T^2 \log n + T \log^2 n)$ rounds. This improves upon a recent result by Censor-Hillel et al. (DISC 2025), which requires $T^2\cdot 2^{O(\sqrt{\log n}\log\log n)}$ rounds.
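Properties (i) and (ii) can be made concrete with a small checker. The sketch below is not the paper's construction: it only tests whether a given task-to-node assignment satisfies the two properties for one specific crash set. The thresholds `max_load` and `min_cover`, the function names, and the cyclic toy assignment are all assumptions introduced for illustration.

```python
from collections import defaultdict


def cyclic_assignment(num_tasks, n, k):
    """Toy assignment (hypothetical): task t goes to nodes t, t+1, ..., t+k-1 mod n."""
    return {t: {(t + i) % n for i in range(k)} for t in range(num_tasks)}


def check_assignment(assignment, n, crashed, incomplete, max_load, min_cover):
    """Check properties (i) and (ii) for one crash set and one set of incomplete tasks.

    Returns (overloaded_nodes, under_covered_tasks), both sorted:
      - overloaded_nodes: live nodes assigned more than max_load incomplete tasks
      - under_covered_tasks: incomplete tasks with fewer than min_cover live nodes
    """
    alive = set(range(n)) - set(crashed)
    load = defaultdict(int)
    under_covered = []
    for task in incomplete:
        live_nodes = assignment[task] & alive
        if len(live_nodes) < min_cover:
            under_covered.append(task)
        for node in live_nodes:
            load[node] += 1
    overloaded = sorted(node for node, count in load.items() if count > max_load)
    return overloaded, sorted(under_covered)


if __name__ == "__main__":
    n = 6
    assignment = cyclic_assignment(num_tasks=6, n=n, k=3)
    # Crashing three contiguous nodes wipes out every node assigned to task 0.
    print(check_assignment(assignment, n, {0, 1, 2}, range(6), max_load=3, min_cover=1))
    # Crashing alternating nodes leaves every task with at least one live node.
    print(check_assignment(assignment, n, {0, 2, 4}, range(6), max_load=3, min_cover=1))
```

In the toy run, crashing nodes 0, 1, and 2 leaves task 0 with no live node, so a simple cyclic window does not survive an adversarial crash pattern. The guarantee described in the abstract is stronger: both properties must hold for every admissible crash set and every set of incomplete tasks, whereas this checker inspects one scenario at a time.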
Problem

Research questions and friction points this paper is trying to address.

Task Completion
Crash-Resilient Computation
Congested Clique
Fault Tolerance
Distributed Algorithms
Innovation

Methods, ideas, or system contributions that make the work stand out.

load balancing covering family
task completion
crash-resilient computation
congested-clique
deterministic algorithm
🔎 Similar Papers