🤖 AI Summary
This work addresses the problem of efficiently encoding a binary string of length $n$ so that it can be synchronized, with low communication, with another string that differs from it by $t$ substring edits, each of bounded length. The authors propose an approach combining combinatorial coding constructions with probabilistic analysis, integrated with string synchronization techniques and hash-based verification. The resulting scheme has low computational complexity and a worst-case encoding length of $4t \log n + o(\log n)$ bits. This matches the leading term of the best known scheme, which achieves $4t \log n + O(\log\log n)$ bits but at a much higher computational complexity. When the string is uniformly distributed, the expected encoding length improves to $(4t - 1) \log n + o(\log n)$ bits, a setting in which prior schemes handled only a single substring edit.
📝 Abstract
We study the document exchange problem under multiple substring edits. A substring edit in a string $\mathbf{x}$ occurs when a substring $\mathbf{u}$ of $\mathbf{x}$ is replaced by an arbitrary string $\mathbf{v}$. The lengths of $\mathbf{u}$ and $\mathbf{v}$ are bounded from above by a fixed constant. Let $\mathbf{x}$ and $\mathbf{y}$ be two binary strings that differ by $t$ substring edits. The aim of a document exchange scheme is to construct a short encoding of $\mathbf{x}$ such that $\mathbf{x}$ can be recovered from $\mathbf{y}$ and the encoding. We construct a low-complexity document exchange scheme with an encoding length of $4t\log n+o(\log n)$ bits, where $n$ is the length of the string $\mathbf{x}$. The best known scheme achieves an encoding length of $4t \log n+O(\log\log n)$ bits, but at a much higher computational complexity. We then investigate the average encoding length of document exchange schemes when $\mathbf{x}$ is uniformly distributed, and develop a scheme with an expected encoding length of $(4t-1) \log n+o(\log n)$ bits. In this setting, prior works have only constructed schemes for a single substring edit.
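To make the edit model concrete, the following minimal Python sketch applies a single substring edit and evaluates the leading terms of the two encoding-length bounds. It is illustrative only and is not the authors' scheme. The function names `substring_edit` and `leading_terms` are hypothetical, the logarithm is assumed to be base 2 (the abstract does not state the base), $t$ is taken to be the number of substring edits, and the lower-order $o(\log n)$ terms are ignored.

```python
import math
from typing import Tuple


def substring_edit(x: str, start: int, u_len: int, v: str) -> str:
    """Replace the substring u = x[start:start + u_len] with the string v.

    Hypothetical helper: the paper bounds len(u) and len(v) by a fixed
    constant, which this sketch does not enforce.
    """
    if start < 0 or u_len < 0 or start + u_len > len(x):
        raise ValueError("substring u must lie inside x")
    return x[:start] + v + x[start + u_len:]


def leading_terms(n: int, t: int) -> Tuple[float, float]:
    """Return the leading terms, in bits, of the worst-case bound 4t*log(n)
    and the expected bound (4t - 1)*log(n).

    Assumes a base-2 logarithm and drops the o(log n) terms.
    """
    log_n = math.log2(n)
    return 4 * t * log_n, (4 * t - 1) * log_n


# One substring edit: u = "101" (positions 2..4) is replaced by v = "0".
x = "0110100110"
y = substring_edit(x, start=2, u_len=3, v="0")
print(y)

# Leading terms for n = 1024 and t = 2 edits.
print(leading_terms(1024, 2))
```

Under these assumptions the two leading terms differ by exactly $\log n$ bits, which is the saving the expected-length result offers over the worst-case bound.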