Tree-NeRV: A Tree-Structured Neural Representation for Efficient Non-Uniform Video Encoding

📅 2025-04-17

📈 Citations: 0

✨ Influential: 0

career value

208K/year

🤖 AI Summary

Existing neural video coding methods employ uniform temporal sampling, which struggles to model temporal redundancy effectively, thereby limiting rate-distortion performance. To address this, we propose a tree-structured implicit representation framework: (i) we introduce a binary search tree (BST) to hierarchically organize temporal features, enabling adaptive non-uniform sampling; (ii) we design a motion-complexity-driven dynamic sampling strategy; and (iii) we incorporate gradient-guided temporal importance assessment with differentiable optimization. Built upon the NeRV architecture, our method significantly improves compression efficiency and reconstruction quality. Extensive experiments demonstrate average PSNR gains of 1.2–2.8 dB and bitrate reductions of 37%–52% across multiple benchmarks, consistently outperforming state-of-the-art uniformly sampled neural video codecs.

Technology Category

Application Category

📝 Abstract

Implicit Neural Representations for Videos (NeRV) have emerged as a powerful paradigm for video representation, enabling direct mappings from frame indices to video frames. However, existing NeRV-based methods do not fully exploit temporal redundancy, as they rely on uniform sampling along the temporal axis, leading to suboptimal rate-distortion (RD) performance. To address this limitation, we propose Tree-NeRV, a novel tree-structured feature representation for efficient and adaptive video encoding. Unlike conventional approaches, Tree-NeRV organizes feature representations within a Binary Search Tree (BST), enabling non-uniform sampling along the temporal axis. Additionally, we introduce an optimization-driven sampling strategy, dynamically allocating higher sampling density to regions with greater temporal variation. Extensive experiments demonstrate that Tree-NeRV achieves superior compression efficiency and reconstruction quality, outperforming prior uniform sampling-based methods. Code will be released.

Problem

Research questions and friction points this paper is trying to address.

Addresses suboptimal video compression in uniform temporal sampling

Proposes tree-structured representation for adaptive video encoding

Improves compression efficiency by optimizing temporal sampling density

Innovation

Methods, ideas, or system contributions that make the work stand out.

Tree-structured feature representation for video encoding

Binary Search Tree enables non-uniform temporal sampling

Optimization-driven sampling for dynamic density allocation

🔎 Similar Papers

VideoPrism: A Foundational Visual Encoder for Video Understanding