NeuralLVC: Neural Lossless Video Compression via Masked Diffusion with Temporal Conditioning

📅 2026-04-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of efficiently exploiting temporal redundancy in neural lossless video compression. It proposes a framework that integrates masked diffusion models with a conventional I/P-frame architecture, introducing a bijective linear tokenization scheme that guarantees exact reconstruction of I-frames and a lightweight reference embedding mechanism for P-frames that captures temporal differences while adding only 1.3% to the model's parameter count. The design enables grouped parallel decoding and end-to-end pixel-exact reconstruction in both the YUV420 and RGB color spaces. Evaluated on nine Xiph CIF sequences, the method significantly outperforms the lossless modes of H.264 and H.265 in compression ratio while maintaining high decoding efficiency.
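The paper does not publish its residual computation, but the core requirement for lossless P-frame coding — that the temporal difference between consecutive frames be exactly invertible given the previous decoded frame — can be illustrated with a minimal sketch. The mod-256 differencing below is an assumption for illustration, not the paper's actual scheme; the function names are hypothetical:

```python
import numpy as np

def temporal_residual(prev, curr):
    # Mod-256 difference of consecutive uint8 frames: a bijection for
    # fixed `prev`, so the decoder can recover `curr` exactly from
    # `prev` plus the residual it entropy-decodes.
    return ((curr.astype(np.int16) - prev.astype(np.int16)) % 256).astype(np.uint8)

def reconstruct(prev, residual):
    # Exact inverse of temporal_residual.
    return ((prev.astype(np.int16) + residual.astype(np.int16)) % 256).astype(np.uint8)

rng = np.random.default_rng(0)
prev = rng.integers(0, 256, size=(288, 352), dtype=np.uint8)  # CIF luma plane
curr = (prev.astype(np.int16) + rng.integers(-3, 4, size=prev.shape)).astype(np.uint8)

res = temporal_residual(prev, curr)
assert np.array_equal(reconstruct(prev, res), curr)  # pixel-exact round trip
```

On real video, such residuals concentrate near zero, which is what lets a conditional entropy model spend fewer bits on P-frames than on I-frames.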
📝 Abstract
While neural lossless image compression has advanced significantly with learned entropy models, lossless video compression remains largely unexplored in the neural setting. We present NeuralLVC, a neural lossless video codec that combines masked diffusion with an I/P-frame architecture for exploiting temporal redundancy. Our I-frame model compresses individual frames using bijective linear tokenization that guarantees exact pixel reconstruction. The P-frame model compresses temporal differences between consecutive frames, conditioned on the previous decoded frame via a lightweight reference embedding that adds only 1.3% trainable parameters. Group-wise decoding enables controllable speed-compression trade-offs. Our codec is lossless in the input domain: for video, it reconstructs YUV420 planes exactly; for image evaluation, RGB channels are reconstructed exactly. Experiments on 9 Xiph CIF sequences show that NeuralLVC outperforms H.264 and H.265 lossless by a significant margin. We verify exact reconstruction through end-to-end encode-decode testing with arithmetic coding. These results suggest that masked diffusion with temporal conditioning is a promising direction for neural lossless video compression.
Problem

Research questions and friction points this paper is trying to address.

neural lossless video compression
temporal redundancy
exact reconstruction
masked diffusion
video codec
Innovation

Methods, ideas, or system contributions that make the work stand out.

masked diffusion
lossless video compression
temporal conditioning
bijective tokenization
neural codec