Neural B-frame Video Compression with Bi-directional Reference Harmonization

📅 2025-11-12
🤖 AI Summary
In neural B-frame video compression (NBVC), the two reference frames can contribute unevenly, particularly at hierarchical levels with a large frame span, which hurts compression efficiency. To address this, the paper proposes Bi-directional Reference Harmonization Video Compression (BRHVC), which makes better use of the motion and contextual information from both references. Its two components are: (1) Bi-directional Motion Converge (BMC), which converges multiple optical flows in motion compression so that motion compensation stays accurate over larger spans; and (2) Bi-directional Contextual Fusion (BCF), which explicitly models the weights of the two reference contexts under the guidance of motion compensation accuracy. On the HEVC datasets, BRHVC outperforms previous state-of-the-art neural video compression methods and surpasses VTM-RA (VTM under random access configuration).

📝 Abstract
Neural video compression (NVC) has made significant progress in recent years, while neural B-frame video compression (NBVC) remains underexplored compared to P-frame compression. NBVC can adopt bi-directional reference frames for better compression performance. However, NBVC's hierarchical coding may complicate continuous temporal prediction, especially at some hierarchical levels with a large frame span, which could cause the contribution of the two reference frames to be unbalanced. To optimize reference information utilization, we propose a novel NBVC method, termed Bi-directional Reference Harmonization Video Compression (BRHVC), with the proposed Bi-directional Motion Converge (BMC) and Bi-directional Contextual Fusion (BCF). BMC converges multiple optical flows in motion compression, leading to more accurate motion compensation on a larger scale. Then BCF explicitly models the weights of reference contexts under the guidance of motion compensation accuracy. With more efficient motions and contexts, BRHVC can effectively harmonize bi-directional references. Experimental results indicate that our BRHVC outperforms previous state-of-the-art NVC methods, even surpassing the traditional coding, VTM-RA (under random access configuration), on the HEVC datasets. The source code is released at https://github.com/kwai/NVC.
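The "large frame span" issue in hierarchical coding can be illustrated with a minimal sketch. It assumes a simplified dyadic GOP in which each B-frame is predicted from the two already-coded frames that bracket it; the function name `hierarchical_b_order` and the level-by-level coding order are illustrative assumptions, not necessarily the configuration used by BRHVC or VTM-RA.

```python
def hierarchical_b_order(gop_size):
    """List (frame, past_ref, future_ref, span) per B-frame, level by level.

    Assumption: frames 0 and gop_size are already coded, and each B-frame
    sits midway between its two references (simplified dyadic hierarchy).
    """
    order = []
    intervals = [(0, gop_size)]
    while intervals:
        next_intervals = []
        for lo, hi in intervals:
            if hi - lo < 2:
                continue  # no frame left between these two references
            mid = (lo + hi) // 2
            order.append((mid, lo, hi, hi - lo))
            next_intervals += [(lo, mid), (mid, hi)]
        intervals = next_intervals
    return order


if __name__ == "__main__":
    for frame, past, future, span in hierarchical_b_order(8):
        print(f"frame {frame}: refs ({past}, {future}), span {span}")
```

With a GOP of 8 under these assumptions, the first B-frame coded is frame 4, predicted from frames 0 and 8 (span 8), while the last level codes the odd frames from references only 2 frames apart. The top levels are where the two references are furthest from the current frame.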

Problem

Research questions and friction points this paper is trying to address.

Optimizing bi-directional reference frame utilization in neural video compression
Addressing unbalanced reference frame contributions in hierarchical B-frame coding
Improving motion compensation accuracy across large temporal frame spans

Innovation

Methods, ideas, or system contributions that make the work stand out.

Bi-directional Motion Converge for optical flow compression
Bi-directional Contextual Fusion models reference weights
Harmonizes bi-directional references for video compression
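
The idea of weighting reference contexts by motion compensation accuracy can be sketched as follows. This is a hand-written stand-in, not the paper's BCF module, which is learned: it assumes a per-pixel warping error map is available for each reference and turns the two errors into fusion weights with a softmax. The function `fuse_contexts`, its arguments, and the `temperature` parameter are all hypothetical.

```python
import numpy as np


def fuse_contexts(ctx_fwd, ctx_bwd, err_fwd, err_bwd, temperature=1.0):
    """Blend two reference contexts, favoring the lower-error reference.

    ctx_*: arrays of shape (C, H, W), contexts warped from each reference.
    err_*: arrays of shape (H, W), assumed per-pixel compensation error.
    Returns the fused context and the (2, H, W) weight maps.
    """
    # Lower error -> larger logit -> larger weight.
    logits = np.stack([-err_fwd, -err_bwd]) / temperature
    logits = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=0, keepdims=True)
    # (H, W) weights broadcast over the channel axis of (C, H, W) contexts.
    fused = weights[0] * ctx_fwd + weights[1] * ctx_bwd
    return fused, weights
```

When both errors are equal, the two contexts are averaged; when one reference is compensated much less accurately, its weight approaches zero and the fused context follows the other reference.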

Yuxi Liu
University of California, Berkeley
general relativity, quantum mechanics, neural network
Dengchao Jin
Kuaishou Technology, Beijing, China
Shuai Huo
Kuaishou Technology, Beijing, China
Jiawen Gu
Kuaishou Technology, Beijing, China
Chao Zhou
Kuaishou Technology, Beijing, China
Huihui Bai
Beijing Jiaotong University, Beijing, China
Ming Lu
Nanjing University, Nanjing, China
Zhan Ma
Vision Lab, Nanjing University
Learning for Video Coding & Communication, Computational Imaging