EV-NVC: Efficient Variable bitrate Neural Video Compression

📅 2025-11-03
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the suboptimal rate-distortion (RD) performance and limited contextual modeling capability of variable-bitrate neural video codecs (V-NVCs) at high bitrates, this paper proposes the EV-NVC framework. EV-NVC introduces three key innovations: (1) a piecewise-linear sampler (PLS) enabling fine-grained, differentiable bitrate adaptation; (2) a long-short-term feature fusion module (LSTFFM) to enhance spatiotemporal context modeling; and (3) a mixed-precision, stage-wise training strategy to improve convergence stability and reconstruction fidelity under high compression. Under low-delay configurations, EV-NVC achieves an average 30.56% BD-rate reduction over HM-16.25, significantly improving both reconstruction quality and RD performance, particularly in the high-bitrate regime.
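The paper's exact PLS formulation is not given in this summary, but the core idea of a piecewise-linear sampler can be sketched: map a continuous rate-control variable in [0, 1] to a quantization scale by interpolating between breakpoints, so intermediate bitrates are reachable without training a separate model per rate point. The function name `piecewise_linear_sample` and the knot values below are hypothetical illustrations, not from the paper.

```python
def piecewise_linear_sample(t, knots):
    """Map t in [0, 1] to a scale by linear interpolation over (x, y) knots.

    A denser/steeper segment near t = 1 gives finer control in the
    high-bitrate range, which is where the paper reports its main gains.
    """
    assert 0.0 <= t <= 1.0
    for (x0, y0), (x1, y1) in zip(knots, knots[1:]):
        if x0 <= t <= x1:
            w = (t - x0) / (x1 - x0)          # position within the segment
            return y0 + w * (y1 - y0)         # linear interpolation
    return knots[-1][1]

# Example knots (hypothetical): the last segment is steep, so small changes
# in t near 1.0 sweep a wide range of high-bitrate quantization scales.
knots = [(0.0, 0.1), (0.5, 0.4), (0.8, 1.2), (1.0, 4.0)]
```

Because each segment is linear in `t`, the mapping is differentiable almost everywhere, which is what makes end-to-end rate adaptation trainable.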

📝 Abstract
Training a neural video codec (NVC) with variable rate is highly challenging because of the complex training strategies and model structures involved. In this paper, we train an efficient variable-bitrate neural video codec (EV-NVC) with a piecewise-linear sampler (PLS) to improve rate-distortion performance in the high-bitrate range, and a long-short-term feature fusion module (LSTFFM) to enhance context modeling. In addition, we introduce mixed-precision training and discuss the training strategy for each stage in detail to fully evaluate its effectiveness. Experimental results show that our approach reduces BD-rate by 30.56% compared to HM-16.25 in low-delay mode.
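The headline 30.56% figure is a Bjøntegaard delta-rate (BD-rate): the average bitrate difference between two codecs' rate-distortion curves at equal quality, with negative values meaning fewer bits. A rough self-contained sketch of the standard computation from four (bitrate, PSNR) points per codec follows; the function names and sample numbers are illustrative, not taken from the paper.

```python
import math

def _cubic_through(xs, ys):
    """Solve the 4x4 Vandermonde system for the cubic through four points."""
    n = 4
    A = [[x ** j for j in range(n)] + [y] for x, y in zip(xs, ys)]
    for col in range(n):  # Gaussian elimination with partial pivoting
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= f * A[col][c]
    coef = [0.0] * n
    for r in reversed(range(n)):
        coef[r] = (A[r][n] - sum(A[r][c] * coef[c]
                                 for c in range(r + 1, n))) / A[r][r]
    return coef  # c0 + c1*q + c2*q^2 + c3*q^3

def bd_rate(anchor, test):
    """Approximate BD-rate (%) from four (bitrate, PSNR) points per codec.

    Fits log-rate as a cubic in PSNR for each codec, integrates both fits
    over the overlapping PSNR range, and converts the mean log-rate gap
    back to a percentage. Negative means the test codec saves bits.
    """
    curves = []
    for pts in (anchor, test):
        log_rates = [math.log(r) for r, _ in pts]
        psnrs = [q for _, q in pts]
        curves.append((_cubic_through(psnrs, log_rates), min(psnrs), max(psnrs)))
    lo = max(curves[0][1], curves[1][1])
    hi = min(curves[0][2], curves[1][2])
    def integral(coef):  # integrate the cubic over [lo, hi]
        return sum(c / (j + 1) * (hi ** (j + 1) - lo ** (j + 1))
                   for j, c in enumerate(coef))
    avg_diff = (integral(curves[1][0]) - integral(curves[0][0])) / (hi - lo)
    return (math.exp(avg_diff) - 1) * 100
```

As a sanity check, a codec that reaches the same PSNRs at exactly half the bitrate yields a BD-rate of -50%.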
Problem

Research questions and friction points this paper is trying to address.

Improving variable bitrate neural video compression efficiency
Enhancing rate-distortion performance at high bitrates
Optimizing context modeling through feature fusion modules
Innovation

Methods, ideas, or system contributions that make the work stand out.

Piecewise linear sampler improves rate-distortion performance
Long-short-term feature fusion enhances context modeling
Mixed-precision training improves efficiency and convergence stability
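This summary does not detail the paper's stage-wise mixed-precision recipe, but such training generally depends on loss scaling: small gradients underflow to zero in fp16, so the loss is multiplied by a large constant before the half-precision backward pass and the scale is divided back out in fp32. The sketch below emulates fp16 storage with Python's `struct` module (`'e'` is the IEEE-754 half-precision format); the variable names are illustrative.

```python
import struct

def to_fp16(x):
    """Round-trip a float through IEEE-754 half precision, emulating fp16 storage."""
    return struct.unpack('e', struct.pack('e', x))[0]

# A tiny gradient value underflows to exactly zero when stored in fp16...
tiny_grad = 1e-8
underflowed = to_fp16(tiny_grad)            # 0.0: the update is lost

# ...but survives if scaled into fp16's representable range first, then
# divided back out at full precision before the optimizer step.
scale = 2.0 ** 16
recovered = to_fp16(tiny_grad * scale) / scale
```

The scale factor is typically adjusted dynamically: increased while gradients stay finite and backed off when overflow (inf/NaN) is detected.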