Coding Schemes for the Noisy Torn Paper Channel

📅 2026-01-16
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This study addresses the reconstruction of noisy, unordered DNA fragments produced by strand decay in DNA-based data storage, modeling the process as a probabilistic torn paper channel (TPC) with substitution errors. It embeds either static markers or data-dependent, hash-based markers into the encoded sequence, two tools that were recently used for the noiseless TPC, so that the fragments can be reordered and the sequence reconstructed. Simulations show a complementary trade-off: static markers perform better at higher substitution probabilities, while data-dependent markers perform better at lower noise levels. Both approaches reach reconstruction rates above 99% with no false decodings observed, and the results are limited primarily by computational resources.

📝 Abstract
To make DNA a suitable medium for archival data storage, it is essential to consider the decay process of the strands observed in DNA storage systems. This paper studies the decay process as a probabilistic noisy torn paper channel (TPC), which first corrupts the bits of the transmitted sequence in a probabilistic manner by substitutions, then breaks the sequence into a set of noisy unordered substrings. The present work devises coding schemes for the noisy TPC by embedding markers in the transmitted sequence. We investigate the use of static markers and markers connected to the data in the form of hash functions. These two tools have also been recently exploited to tackle the noiseless TPC. Simulations show that static markers excel at higher substitution probabilities, while data-dependent markers are superior at lower noise levels. Both approaches achieve reconstruction rates exceeding $99\%$ with no false decodings observed, primarily limited by computational resources.
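The channel described in the abstract can be illustrated with a small simulation. This is a minimal sketch, not the paper's model: the abstract states only that bits are first corrupted by substitutions and the sequence is then broken into unordered substrings. The independent per-position break probability `p_break`, the parameter name `p_sub`, and the function name `noisy_torn_paper_channel` are all assumptions made for illustration.

```python
import random


def noisy_torn_paper_channel(bits, p_sub, p_break, rng):
    """Toy noisy TPC: substitute bits, tear the sequence, shuffle the pieces.

    Assumptions (not taken from the paper): each bit flips independently
    with probability p_sub, and a tear occurs after each position
    independently with probability p_break.
    """
    # Step 1: substitution noise, applied before the sequence is torn.
    noisy = [b ^ 1 if rng.random() < p_sub else b for b in bits]

    # Step 2: tear the noisy sequence into contiguous fragments.
    fragments, current = [], []
    for i, b in enumerate(noisy):
        current.append(b)
        is_last = i == len(noisy) - 1
        if is_last or rng.random() < p_break:
            fragments.append(current)
            current = []

    # Step 3: the receiver sees the fragments in no particular order.
    rng.shuffle(fragments)
    return fragments


if __name__ == "__main__":
    rng = random.Random(2026)
    message = [rng.randint(0, 1) for _ in range(32)]
    for fragment in noisy_torn_paper_channel(message, 0.02, 0.15, rng):
        print("".join(map(str, fragment)))
```

The decoder receives only the shuffled fragments, so it must recover both their order and the flipped bits. Embedding markers in the transmitted sequence is how the schemes studied here make that reordering possible.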
Problem

Research questions and friction points this paper is trying to address.

DNA storage
noisy torn paper channel
substitution errors
unordered substrings
data decay
Innovation

Methods, ideas, or system contributions that make the work stand out.

noisy torn paper channel
DNA data storage
marker-based coding
hash-based markers
sequence reconstruction