Enhancing Neural Video Compression of Static Scenes with Positive-Incentive Noise

📅 2026-03-06
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the limitations of existing video compression methods: conventional codecs struggle to exploit the temporal redundancy of static scenes efficiently, neural codecs are sensitive to train-test distribution shifts, and generative approaches often introduce spurious details. To overcome these issues, the authors propose a paradigm that models short-term temporal variations as positive-incentive noise within a neural video compression framework. This approach explicitly decouples transient dynamics from the static background, enabling the model to internalize structural priors. During inference, only the invariant background needs to be transmitted, at an extremely low bitrate, achieving a favorable trade-off between fidelity and compression efficiency. By introducing positive-incentive noise into static-scene compression for the first time, the method attains a 73% Bjøntegaard delta rate saving over general neural video compression, thereby supporting robust transmission under poor network conditions and cost-effective long-term storage of surveillance video.
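The summary's core idea, splitting a clip into an invariant background plus transient per-frame variations, can be illustrated with a minimal decomposition. The temporal-median background and additive residuals below are an illustrative stand-in, not the paper's actual model:

```python
import numpy as np

def decouple_static_background(frames: np.ndarray):
    """Split a clip into a static background and per-frame transient residuals.

    frames: array of shape (T, H, W) -- grayscale frames of a static scene.
    Returns (background, residuals) such that frames == background + residuals.
    """
    # The temporal median is robust to brief foreground motion.
    background = np.median(frames, axis=0)
    # Residuals capture short-term variation -- under the paper's view,
    # these play the role of "noise" around the invariant background.
    residuals = frames - background
    return background, residuals

# Toy clip: a constant background plus small transient perturbations.
rng = np.random.default_rng(0)
clip = np.full((8, 4, 4), 100.0) + rng.normal(0.0, 1.0, (8, 4, 4))
bg, res = decouple_static_background(clip)
```

In this toy setting only `bg` (one frame's worth of data) would need to be signaled for the whole clip, which is the intuition behind the extremely low bitrate claimed for the invariant component.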


๐Ÿ“ Abstract
Static scene videos, such as surveillance feeds and videotelephony streams, constitute a dominant share of storage consumption and network traffic. However, both traditional standardized codecs and neural video compression (NVC) methods struggle to encode these videos efficiently due to inadequate usage of temporal redundancy and severe distribution gaps between training and test data, respectively. While recent generative compression methods improve perceptual quality, they introduce hallucinated details that are unacceptable in authenticity-critical applications. To overcome these limitations, we propose to incorporate positive-incentive noise into NVC for static scene videos, where short-term temporal changes are reinterpreted as positive-incentive noise to facilitate model finetuning. By disentangling transient variations from the persistent background, structured prior information is internalized in the compression model. During inference, the invariant component requires minimal signaling, thus reducing data transmission while maintaining pixel-level fidelity. Preliminary experiments demonstrate a 73% Bjøntegaard delta (BD) rate saving compared to general NVC models. Our method provides an effective solution to trade computation for bandwidth, enabling robust video transmission under adverse network conditions and economic long-term retention of surveillance footage.
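The reported 73% saving refers to the Bjøntegaard delta (BD) rate, which compares two rate-distortion curves at equal quality. A common way to compute it, a cubic fit of log-rate over PSNR integrated across the overlapping quality range, is sketched below; this is the standard metric procedure, not code from the paper:

```python
import numpy as np

def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Bjontegaard delta rate: average bitrate change (%) of the test codec
    relative to the anchor at equal quality. Negative means bitrate savings."""
    lr_a, lr_t = np.log(rate_anchor), np.log(rate_test)
    # Fit log-rate as a cubic polynomial of PSNR for each RD curve.
    coef_a = np.polynomial.polynomial.polyfit(psnr_anchor, lr_a, 3)
    coef_t = np.polynomial.polynomial.polyfit(psnr_test, lr_t, 3)
    # Integrate both fits over the overlapping quality interval.
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    int_a = np.polynomial.polynomial.Polynomial(coef_a).integ()
    int_t = np.polynomial.polynomial.Polynomial(coef_t).integ()
    avg_log_diff = ((int_t(hi) - int_t(lo)) - (int_a(hi) - int_a(lo))) / (hi - lo)
    return (np.exp(avg_log_diff) - 1.0) * 100.0

# Toy check: a codec needing half the bitrate at every quality level
# should report roughly a -50% BD-rate.
psnr = [30.0, 34.0, 38.0, 42.0]
anchor_kbps = [100.0, 200.0, 400.0, 800.0]
test_kbps = [50.0, 100.0, 200.0, 400.0]
saving = bd_rate(anchor_kbps, psnr, test_kbps, psnr)
```

By this convention, the paper's 73% BD-rate saving corresponds to `bd_rate` returning about -73 against the general NVC anchor.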
Problem

Research questions and friction points this paper is trying to address.

neural video compression
static scenes
temporal redundancy
distribution gap
authenticity-critical applications
Innovation

Methods, ideas, or system contributions that make the work stand out.

neural video compression
positive-incentive noise
static scene video
temporal redundancy
perceptual fidelity
Cheng Yuan
Associate Professor, School of Mathematics and Statistics, Central China Normal University
Computational Physics · Deep Learning
Zhenyu Jia
Institute of Artificial Intelligence (TeleAI), China Telecom
Jiawei Shao
Institute of Artificial Intelligence (TeleAI), China Telecom
Xuelong Li
Institute of Artificial Intelligence (TeleAI), China Telecom