From Pixels to Nucleotides: End-to-End Token-Based Video Compression for DNA Storage

📅 2026-04-15
📈 Citations: 0 (influential: 0)

🤖 AI Summary
This work addresses the mismatch between biochemical constraints and compression objectives in video DNA storage, which arises when conventional pipelines treat compression and molecular encoding as separate stages. The authors propose HELIX, which they describe as the first end-to-end framework that jointly optimizes neural video compression and DNA encoding. HELIX uses discrete semantic tokens as the shared representation between the visual signal and the quaternary ATCG alphabet, and co-designs the encoding and decoding pipeline at the token level. Its TK-SCONE (Token-Kronecker Structured Constraint-Optimized Neural Encoding) component applies Kronecker-structured mixing to break spatial correlations and a finite-state machine (FSM) mapping to guarantee that biochemical constraints are met, so that token distributions are optimized jointly for visual quality, prediction under masking, and DNA synthesis efficiency. The reported storage density is 1.91 bits per nucleotide, against a theoretical maximum of 2 bits per nucleotide for a four-letter alphabet. This is reported to substantially outperform pipeline-based baselines and to support the feasibility of end-to-end learning for video DNA storage.

📝 Abstract
DNA-based storage has emerged as a promising approach to the global data crisis, offering molecular-scale density and millennial-scale stability at low maintenance cost. Over the past decade, substantial progress has been made in storing text, images, and files in DNA, yet video remains an open challenge. The difficulty is not merely technical: effective video DNA storage requires co-designing compression and molecular encoding from the ground up, a challenge that sits at the intersection of two fields that have largely evolved independently. In this work, we present HELIX, the first end-to-end neural network jointly optimizing video compression and DNA encoding; prior approaches treat the two stages independently, leaving biochemical constraints and compression objectives fundamentally misaligned. Our key insight is that token-based representations naturally align with DNA's quaternary alphabet: discrete semantic units map directly to ATCG bases. We introduce TK-SCONE (Token-Kronecker Structured Constraint-Optimized Neural Encoding), which achieves 1.91 bits per nucleotide through Kronecker-structured mixing that breaks spatial correlations and FSM-based mapping that guarantees biochemical constraints. Unlike two-stage approaches, HELIX learns token distributions simultaneously optimized for visual quality, prediction under masking, and DNA synthesis efficiency. This work demonstrates for the first time that learned compression and molecular storage converge naturally at token representations, suggesting a new paradigm where neural video codecs are designed for biological substrates from the ground up.
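
To make the idea of an FSM-based mapping concrete, the sketch below shows a generic finite-state encoder in which the state is the previously emitted nucleotide and each input symbol selects one of the three remaining bases, so no two adjacent bases are ever equal. This is a classic rotating-code construction used here purely for illustration; it is not TK-SCONE, and the abstract does not specify which constraints HELIX enforces. The function names, the use of raw bytes as a stand-in for token indices, the fixed six base-3 digits per byte, and the virtual start state `"A"` are all assumptions of this sketch.

```python
BASES = "ACGT"
TRITS_PER_BYTE = 6  # 3**6 = 729 >= 256, so six base-3 digits cover one byte


def allowed_next(prev: str) -> list[str]:
    """FSM transition table: any base except the one just emitted."""
    return [b for b in BASES if b != prev]


def bytes_to_trits(data: bytes) -> list[int]:
    """Expand each byte into six base-3 digits, most significant first."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(TRITS_PER_BYTE):
            digits.append(byte % 3)
            byte //= 3
        trits.extend(reversed(digits))
    return trits


def encode(data: bytes, start: str = "A") -> str:
    """Map bytes to a strand with no repeated adjacent bases."""
    prev = start  # virtual initial state, not emitted
    strand = []
    for trit in bytes_to_trits(data):
        prev = allowed_next(prev)[trit]
        strand.append(prev)
    return "".join(strand)


def decode(strand: str, start: str = "A") -> bytes:
    """Invert encode() by replaying the same state transitions."""
    prev = start
    trits = []
    for base in strand:
        trits.append(allowed_next(prev).index(base))
        prev = base
    out = bytearray()
    for i in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for trit in trits[i:i + TRITS_PER_BYTE]:
            value = value * 3 + trit
        out.append(value)
    return bytes(out)


if __name__ == "__main__":
    payload = b"HELIX"
    strand = encode(payload)
    print(strand, decode(strand) == payload)
```

This toy code spends 6 nucleotides per byte, about 1.33 bits per nucleotide, and even an ideal code under a strict no-repeat rule cannot exceed log2(3) ≈ 1.58 bits per nucleotide. Both figures are below the 1.91 bits per nucleotide reported for TK-SCONE, so the paper's mapping presumably enforces a less restrictive constraint set than the one sketched here.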

Problem

Research questions and friction points this paper is trying to address.

DNA storage
video compression
token-based representation
biochemical constraints
end-to-end optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

token-based representation
end-to-end optimization
DNA storage
neural video compression
biochemical constraints
🔎 Similar Papers