Video Compression with Hierarchical Temporal Neural Representation

📅 2026-01-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a limitation of existing implicit neural representation (INR)-based video compression methods: they model complex temporal dependencies poorly. It proposes TeNeRV, a hierarchical temporal neural representation framework that improves local temporal consistency through an inter-frame feature fusion module and captures both short- and long-term dependencies through a Group-of-Pictures (GoP)-adaptive modulation mechanism. TeNeRV models video content implicitly as a continuous function and adjusts its network parameters according to the GoP structure. In the reported experiments, TeNeRV consistently outperforms existing INR-based video compression methods in rate-distortion performance.

📝 Abstract
Video compression has recently benefited from implicit neural representations (INRs), which model videos as continuous functions. INRs offer compact storage and flexible reconstruction, providing a promising alternative to traditional codecs. However, most existing INR-based methods treat the temporal dimension as an independent input, limiting their ability to capture complex temporal dependencies. To address this, we propose a Hierarchical Temporal Neural Representation for Videos, TeNeRV. TeNeRV integrates short- and long-term dependencies through two key components. First, an Inter-Frame Feature Fusion (IFF) module aggregates features from adjacent frames, enforcing local temporal coherence and capturing fine-grained motion. Second, a GoP-Adaptive Modulation (GAM) mechanism partitions videos into Groups-of-Pictures and learns group-specific priors. The mechanism modulates network parameters, enabling adaptive representations across different GoPs. Extensive experiments demonstrate that TeNeRV consistently outperforms existing INR-based methods in rate-distortion performance, validating the effectiveness of our proposed approach.
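The two components named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the fixed averaging weights, the FiLM-style scale-and-shift modulation, and every name below (`gop_index`, `fuse_adjacent`, `modulate`, `gop_params`) are assumptions chosen only to show the general idea of blending adjacent-frame features and then applying group-specific parameters.

```python
from typing import List

Vector = List[float]


def gop_index(frame_idx: int, gop_size: int) -> int:
    """Map a frame index to the index of the GoP that contains it."""
    return frame_idx // gop_size


def fuse_adjacent(features: List[Vector], t: int,
                  weights=(0.25, 0.5, 0.25)) -> Vector:
    """Blend frame t's features with its neighbours (clamped at the ends).

    Assumption: a fixed weighted average stands in for a learned fusion module.
    """
    last = len(features) - 1
    prev_f = features[max(t - 1, 0)]
    next_f = features[min(t + 1, last)]
    w_prev, w_cur, w_next = weights
    return [w_prev * p + w_cur * c + w_next * n
            for p, c, n in zip(prev_f, features[t], next_f)]


def modulate(feature: Vector, scale: Vector, shift: Vector) -> Vector:
    """Apply a per-GoP scale and shift (FiLM-style) to a feature vector."""
    return [s * x + b for x, s, b in zip(feature, scale, shift)]


# Toy data: 6 frames with 2-dimensional features, split into GoPs of 3 frames.
features = [[float(t), 1.0] for t in range(6)]
gop_size = 3
# Hypothetical (scale, shift) pair per GoP; a trained model would learn these.
gop_params = [([1.0, 1.0], [0.0, 0.0]), ([2.0, 1.0], [0.0, 1.0])]

outputs = []
for t in range(len(features)):
    fused = fuse_adjacent(features, t)                 # short-term: neighbours
    scale, shift = gop_params[gop_index(t, gop_size)]  # long-term: group prior
    outputs.append(modulate(fused, scale, shift))
```

In this sketch, frames 0 to 2 share one set of modulation parameters and frames 3 to 5 share another, so the same fused feature is transformed differently depending on the group it belongs to.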
Problem

Research questions and friction points this paper is trying to address.

video compression
implicit neural representations
temporal dependencies
neural video representation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical Temporal Representation
Inter-Frame Feature Fusion
GoP-Adaptive Modulation
Implicit Neural Representation
Video Compression
Authors
Jun Zhu
University of Chinese Academy of Sciences
Xinfeng Zhang
Fuxi AI Lab, NetEase Inc.
Vision-Language Models, Multimodal
Lv Tang
University of Alberta. Former researcher @ UCAS/Nanjing University
Computer Vision, MLLM, Video Compression, Image Segmentation
Junhao Jiang
University of Chinese Academy of Sciences
Gai Zhang
University of Chinese Academy of Sciences
Jia Wang
University of Chinese Academy of Sciences