TeCoNeRV: Leveraging Temporal Coherence for Compressible Neural Representations for Videos

📅 2026-02-18
🤖 AI Summary
This work addresses the limitations of existing hypernetwork-based implicit neural representations for video, which suffer from low reconstruction quality, large compressed size, and prohibitive memory consumption at high resolutions. The paper proposes TeCoNeRV, a framework that combines three components: weight prediction decomposed spatially and temporally over patch tubelets, a residual storage scheme that encodes only the differences between consecutive segment representations, and a temporal coherence regularization. On the UVG dataset, the method improves PSNR over the baseline by 2.47 dB at 480p and 5.35 dB at 720p, with 36% lower bitrates and 1.5–3× faster encoding. It is also the first hypernetwork-based approach to report results at 480p, 720p, and 1080p on UVG, HEVC, and MCL-JCV.
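As a rough illustration of what a spatiotemporal block-wise decomposition can look like, the sketch below cuts a short clip into non-overlapping space-time blocks, each spanning every frame of the segment. It is not the authors' implementation: the function name `split_into_tubelets`, the nested-list video layout, and the non-overlapping grid are all assumptions made for this example.

```python
def split_into_tubelets(segment, patch_h, patch_w):
    """Split a short video segment into non-overlapping space-time blocks.

    `segment` is a list of frames; each frame is a list of rows; each row is
    a list of pixel values. Hypothetical helper, not taken from the paper.
    Returns a list of blocks in row-major patch order; every block keeps all
    frames of the segment but only one patch_h x patch_w spatial window.
    """
    height, width = len(segment[0]), len(segment[0][0])
    if height % patch_h or width % patch_w:
        raise ValueError("frame size must be divisible by the patch size")

    tubelets = []
    for top in range(0, height, patch_h):
        for left in range(0, width, patch_w):
            # Same spatial window, cropped out of every frame in the segment.
            tubelet = [
                [row[left:left + patch_w] for row in frame[top:top + patch_h]]
                for frame in segment
            ]
            tubelets.append(tubelet)
    return tubelets


# Toy segment: 2 frames of 4x4 pixels, each pixel labelled (frame, y, x).
toy_segment = [[[(t, y, x) for x in range(4)] for y in range(4)] for t in range(2)]
toy_tubelets = split_into_tubelets(toy_segment, patch_h=2, patch_w=2)
```

Each block is far smaller than the full-resolution segment, which is the intuition behind predicting weights per block instead of for whole frames at once.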

📝 Abstract
Implicit Neural Representations (INRs) have recently demonstrated impressive performance for video compression. However, since a separate INR must be overfit for each video, scaling to high-resolution videos while maintaining encoding efficiency remains a significant challenge. Hypernetwork-based approaches predict INR weights (hyponetworks) for unseen videos at high speeds, but with low quality, large compressed size, and prohibitive memory needs at higher resolutions. We address these fundamental limitations through three key contributions: (1) an approach that decomposes the weight prediction task spatially and temporally, by breaking short video segments into patch tubelets, to reduce the pretraining memory overhead by 20$\times$; (2) a residual-based storage scheme that captures only differences between consecutive segment representations, significantly reducing bitstream size; and (3) a temporal coherence regularization framework that encourages changes in the weight space to be correlated with video content. Our proposed method, TeCoNeRV, achieves substantial improvements of 2.47 dB and 5.35 dB PSNR over the baseline at 480p and 720p on UVG, with 36% lower bitrates and 1.5–3$\times$ faster encoding speeds. With our low memory usage, we are the first hypernetwork approach to demonstrate results at 480p, 720p, and 1080p on UVG, HEVC, and MCL-JCV. Our project page is available at https://namithap10.github.io/teconerv/ .
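The residual-based storage idea in contribution (2) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's codec: the uniform quantization step, the flat weight lists, and the names `encode_residuals` and `decode_residuals` are hypothetical, and no entropy coding is applied.

```python
def quantize(weights, step):
    """Map float weights to integer codes with a uniform step (an assumption)."""
    return [round(w / step) for w in weights]


def encode_residuals(segment_weights, step=0.01):
    """Store the first segment in full and every later segment as a difference.

    Differences are taken between integer codes, so decoding reproduces the
    quantized weights exactly and errors do not accumulate across segments.
    """
    codes = [quantize(w, step) for w in segment_weights]
    stream = [codes[0]]
    for prev, cur in zip(codes, codes[1:]):
        stream.append([c - p for p, c in zip(prev, cur)])
    return stream


def decode_residuals(stream, step=0.01):
    """Rebuild per-segment weights by accumulating the stored differences."""
    current = list(stream[0])
    decoded = [[code * step for code in current]]
    for residual in stream[1:]:
        current = [p + r for p, r in zip(current, residual)]
        decoded.append([code * step for code in current])
    return decoded


# Three consecutive segments whose (toy) weights change only slightly.
toy_weights = [
    [0.10, -0.20, 0.30],
    [0.11, -0.20, 0.29],
    [0.12, -0.21, 0.29],
]
toy_stream = encode_residuals(toy_weights)
toy_decoded = decode_residuals(toy_stream)
```

When consecutive segments have similar weights, the stored differences are small integers clustered around zero, which is what makes them cheaper to code than the full weights. This is also why a regularizer that keeps weight changes tied to content changes, as in contribution (3), would be expected to help the residual scheme.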
Problem

Research questions and friction points this paper is trying to address.

video compression
implicit neural representations
hypernetwork
temporal coherence
high-resolution video
Innovation

Methods, ideas, or system contributions that make the work stand out.

Temporal Coherence
Implicit Neural Representation
Hypernetwork
Video Compression
Residual Storage