🤖 AI Summary
This work studies error-correcting codes that allow an original sequence to be uniquely reconstructed from a constant number of noisy reads (5, 9, 11, or 14), where each read is corrupted by a single deletion and a single substitution. By analyzing the intersection structure of the error balls induced by these errors, and by combining combinatorial coding theory with logarithmic-scale parity checks, the authors reduce the redundancy without requiring the number of reads to grow with the sequence length \(n\). Specifically, for five reads the redundancy is \(3\log n + 4\) bits, compared with \((4+o(1))\log n\) bits for the best known codes that correct these errors from a single read; for nine and eleven reads it is \(2\log n + 12\log\log n + O(1)\) and \(\log n + 12\log\log n + O(1)\), respectively; and for fourteen reads, \(\log n + 3\) bits suffice, only two bits more than the best known codes for eighteen reads.
📝 Abstract
In this paper, we investigate the problem of designing $(n, N; \mathcal{B})$-reconstruction codes for $N\in \{5,9,11,14\}$, where $\mathcal{B}$ is the single-deletion single-substitution ball function, which maps a sequence to the set of all sequences obtainable from it via one deletion and one substitution. Such a code is defined by the requirement that the single-deletion single-substitution balls of any two distinct codewords intersect in strictly fewer than $N$ sequences, where $N$ is the given number of noisy reads. Note that for any $1\le N<N'$, an $(n, N; \mathcal{B})$-reconstruction code is also an $(n, N'; \mathcal{B})$-reconstruction code. Consequently, designing $(n, N; \mathcal{B})$-reconstruction codes with low redundancy becomes more challenging as $N$ decreases; in particular, the case $N=1$ is exactly the problem of constructing single-deletion single-substitution correcting codes. To the best of our knowledge, most existing results focus on the case where $N$ is a linear function of $n$, while only a few consider constant $N$. When $N=1$, the best known $(n, 1; \mathcal{B})$-reconstruction codes (single-deletion single-substitution correcting codes) require $(4+o(1))\log n$ redundant bits. In this work, we show that this redundancy can be reduced to $3\log n+4$ when $N=5$. As $N$ increases further to $9$ and $11$, the redundancy can be improved to $2\log n+12\log\log n+O(1)$ and $\log n +12\log \log n+O(1)$, respectively. Finally, for $N=14$, we provide a reconstruction code with $\log n+3$ bits of redundancy, which is only two bits more than the best known $(n, 18; \mathcal{B})$-reconstruction codes.
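The defining condition can be checked by brute force on toy examples. The following is a minimal sketch, not the paper's construction: it assumes binary sequences and takes the ball to contain every sequence obtained by exactly one deletion followed by at most one substitution (one common convention; the abstract does not spell this out). The helper names `ds_ball`, `max_intersection`, and `is_reconstruction_code` are hypothetical.

```python
from itertools import combinations
from typing import Iterable, Set


def ds_ball(x: str) -> Set[str]:
    """Single-deletion single-substitution ball of a binary string x.

    Assumption: exactly one deletion followed by at most one substitution.
    """
    # All sequences reachable by deleting one symbol.
    deleted = {x[:i] + x[i + 1:] for i in range(len(x))}
    ball = set(deleted)
    # From each of those, flip at most one remaining bit.
    for y in deleted:
        for j, bit in enumerate(y):
            flipped = "1" if bit == "0" else "0"
            ball.add(y[:j] + flipped + y[j + 1:])
    return ball


def max_intersection(code: Iterable[str]) -> int:
    """Largest ball intersection over all pairs of distinct codewords."""
    words = list(code)
    return max(
        (len(ds_ball(u) & ds_ball(v)) for u, v in combinations(words, 2)),
        default=0,
    )


def is_reconstruction_code(code: Iterable[str], num_reads: int) -> bool:
    """True if every pairwise ball intersection is strictly smaller than num_reads."""
    return max_intersection(code) < num_reads


if __name__ == "__main__":
    toy_code = ["0000", "1111"]  # illustrative toy code, not from the paper
    print(max_intersection(toy_code))
    print(is_reconstruction_code(toy_code, 1))
```

Because the check is `max_intersection(code) < num_reads`, any code that passes for some $N$ also passes for every larger $N'$, which is the monotonicity noted in the abstract. For $N=1$ the condition says the balls are pairwise disjoint, i.e. the code corrects one deletion and one substitution from a single read. The search is exponential in practice and is only meant for very small $n$.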