A Converse For the Capacity of the Shotgun Sequencing Channel with Erasures

📅 2025-09-25
🤖 AI Summary
This work studies the information-theoretic capacity of the shotgun sequencing channel for DNA storage when the symbols in the reads suffer random erasures. The authors analyze a genie-aided decoder that knows the correct locations of the reads and use it to derive a converse (upper) bound on capacity. The bound is not tight in general, but it meets the previously known achievable rate asymptotically in some channel parameters. Together with the earlier achievability result, it gives a theoretical benchmark for the read phase of DNA-based storage systems.

📝 Abstract
The shotgun sequencing process involves fragmenting a long DNA sequence (the input string) into numerous shorter, unordered, and overlapping segments (referred to as reads). The reads are sequenced and later aligned to reconstruct the original string. Viewing the sequencing process as the read phase of a DNA storage system, the information-theoretic capacity of noise-free shotgun sequencing has been characterized in the literature. Motivated by the base-wise quality scores available in practical sequencers, a recent work considered the shotgun sequencing channel with erasures, in which the symbols in the reads are assumed to contain random erasures. Achievable rates for this channel were identified. In the present work, we obtain a converse for this channel. The proof involves a careful analysis of a genie-aided decoder, which knows the correct locations of the reads. The converse is not tight in general. However, it meets the achievability result asymptotically in some channel parameters.
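To make the channel described in the abstract concrete, here is a minimal simulation sketch in Python. It is an illustration only, not the paper's model or proof: the uniform read start positions, the fixed read length, the absence of wrap-around, the `?` erasure marker, and the names `shotgun_reads` and `genie_merge` are all assumptions introduced here. The second function mimics what a genie-aided decoder sees once the true read locations are revealed: a position of the input is known exactly when at least one read covers it with an unerased symbol.

```python
import random

ERASURE = "?"  # hypothetical marker for an erased symbol


def shotgun_reads(x, num_reads, read_len, erasure_prob, rng):
    """Sample noisy reads of x; return (reads, starts) in shuffled order.

    Assumptions (not from the paper): start positions are uniform,
    reads have a fixed length, and there is no wrap-around.
    """
    n = len(x)
    sampled = []
    for _ in range(num_reads):
        start = rng.randrange(n - read_len + 1)
        # Erase each symbol independently with probability erasure_prob.
        read = "".join(
            ERASURE if rng.random() < erasure_prob else symbol
            for symbol in x[start:start + read_len]
        )
        sampled.append((start, read))
    rng.shuffle(sampled)  # the decoder receives the reads unordered
    starts = [start for start, _ in sampled]
    reads = [read for _, read in sampled]
    return reads, starts


def genie_merge(n, reads, starts):
    """Place each read at its true location (the genie's side information).

    A position stays erased only if no read covers it unerased.
    """
    merged = [ERASURE] * n
    for start, read in zip(starts, reads):
        for offset, symbol in enumerate(read):
            if symbol != ERASURE:
                merged[start + offset] = symbol
    return "".join(merged)


# Example run with arbitrary illustrative parameters.
rng = random.Random(0)
x = "".join(rng.choice("ACGT") for _ in range(60))
reads, starts = shotgun_reads(x, num_reads=20, read_len=10,
                              erasure_prob=0.2, rng=rng)
merged = genie_merge(len(x), reads, starts)
```

A real decoder receives only `reads`; handing it `starts` as well can only help, which is why analyzing the genie-aided decoder yields an upper bound on what any decoder can achieve.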
Problem

Research questions and friction points this paper is trying to address.

Characterizing the capacity of DNA shotgun sequencing channels with erasures
Establishing a converse bound for information rates in noisy sequencing
Analyzing a genie-aided decoder with known read positions as the proof technique
Innovation

Methods, ideas, or system contributions that make the work stand out.

Developed converse for shotgun sequencing channel with erasures
Analyzed genie-aided decoder with known read locations
Showed that the converse meets the known achievability result asymptotically in some channel parameters
Mohammed Ihsan Ali
Signal Processing and Communications Research Center, International Institute of Information Technology, Hyderabad 500032, India
Hrishi Narayanan
Institute for Communications Engineering, Technical University of Munich, Germany
Prasad Krishnan
International Institute of Information Technology, Hyderabad
Coding Theory, Coded Caching, Index Coding, Network Coding