🤖 AI Summary
This work proposes OmniZip, the first lightweight unified framework for lossless compression of multi-modal data, addressing the limitation that existing learning-based compressors are typically confined to a single modality and ill-suited to resource-constrained edge devices. OmniZip achieves high compression efficiency across diverse modalities (images, text, speech, tactile signals, database tables, and genomic sequences) through a modality-agnostic invertible tokenizer, routing-based context-modeling and feedforward mechanisms, and a reparameterization-aware training strategy, all under minimal computational overhead. Experimental results show that OmniZip outperforms gzip by 42%–62% in compression efficiency on benchmarks including CLIC-M, TouchandGo, enwik9, LibriSpeech, and WikiSQL, while enabling near real-time inference, at roughly 1 MB/s, on commodity hardware such as MacBook CPUs and iPhone NPUs.
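Learning-based lossless compressors of this kind pair a probability model with an entropy coder: a symbol predicted with probability p costs about -log2(p) bits, so a better predictor directly yields a smaller file. A minimal illustration of that principle (not the paper's code; the models and data here are toy examples):

```python
import math

def ideal_code_length(symbols, model):
    """Total bits an entropy coder would need under a given
    probability model (Shannon limit: -log2 p per symbol)."""
    return sum(-math.log2(model[s]) for s in symbols)

data = "aaab"  # toy byte stream standing in for any modality's tokens

uniform = {"a": 0.5, "b": 0.5}    # uninformed model
learned = {"a": 0.75, "b": 0.25}  # model fit to the data's statistics

print(ideal_code_length(data, uniform))  # -> 4.0 bits
print(ideal_code_length(data, learned))  # -> ~3.245 bits
```

The gap between the two totals is exactly the headroom a stronger learned model can exploit, which is why compression ratio tracks predictive quality.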
📝 Abstract
Lossless compression is essential for efficient data storage and transmission. Although learning-based lossless compressors achieve strong results, most are designed for a single modality, leading to redundant compressor deployments in multi-modal settings. Designing a unified multi-modal compressor is critical yet challenging, as different data types vary widely in format, dimensionality, and statistics. Multi-modal large language models offer a promising solution but remain too complex for practical use. We therefore propose **OmniZip**, **a unified and lightweight lossless compressor for multi-modal data (image, text, speech, tactile, database, and gene-sequence data)**. Built on a lightweight backbone, OmniZip incorporates three key components to enable efficient multi-modal lossless compression: a modality-unified tokenizer that reversibly transforms diverse data into tokens, a modality-routing context-learning mechanism that enables flexible multi-modal context modeling, and a modality-routing feedforward design that further enhances the model's nonlinear representation flexibility. A reparameterization training strategy is used to enhance model capacity. OmniZip outperforms or matches state-of-the-art compressors across multiple modalities, achieving 42%, 57%, 62%, 42%, and 53% higher compression efficiency than gzip on the CLIC-M, TouchandGo, enwik9, LibriSpeech, and WikiSQL datasets, respectively. It also supports near real-time inference on resource-constrained edge devices, reaching about 1 MB/s on MacBook CPUs and iPhone NPUs. Our code is released at https://github.com/adminasmi/OmniZip-CVPR2026.
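The abstract does not detail the modality-routing layers, but the general idea of routing each token to a modality-specific subnetwork while sharing the rest of the backbone can be sketched as follows (a hypothetical illustration with made-up expert functions, not the authors' implementation):

```python
# Minimal sketch of a modality-routing feedforward layer.
# Each modality id routes its token embedding to a dedicated
# lightweight "expert"; everything else in the backbone is shared.

def make_expert(scale):
    """Stand-in expert: an elementwise affine transform."""
    def expert(x):
        return [scale * v + 1.0 for v in x]
    return expert

# One expert per modality (names and transforms are illustrative).
experts = {
    "image": make_expert(0.5),
    "text":  make_expert(2.0),
    "audio": make_expert(1.5),
}

def routed_ffn(token, modality):
    """Dispatch a token embedding to its modality's expert."""
    return experts[modality](token)

print(routed_ffn([1.0, 2.0], "text"))   # -> [3.0, 5.0]
print(routed_ffn([1.0, 2.0], "image"))  # -> [1.5, 2.0]
```

Routing by a known modality tag (rather than a learned gate) keeps dispatch overhead negligible, which matches the paper's emphasis on lightweight edge deployment.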