SCALED: Surrogate-gradient for Codec-Aware Learning of Downsampling in ABR Streaming

📅 2026-01-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the suboptimal end-to-end rate-distortion performance in traditional adaptive bitrate (ABR) video streaming, where downsampling, encoding, and upsampling are optimized independently, and standard non-differentiable codecs hinder joint training. To overcome this limitation, the authors propose a data-driven surrogate gradient framework that, for the first time, enables end-to-end training by directly leveraging the compression error from real, non-differentiable codecs to construct gradients, without relying on differentiable proxy models. This approach facilitates codec-aware downsampling optimization, aligning the training objective with actual deployment performance. Experimental results demonstrate a 5.19% BD-BR (PSNR) improvement over codec-agnostic methods across the full rate-distortion convex hull.
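The surrogate-gradient idea described in the summary can be illustrated with a straight-through-style estimator: the forward pass calls a real, non-differentiable codec, and the backward pass builds a gradient from the actual compression output while approximating the codec's Jacobian by the identity. This is a minimal sketch, not the paper's actual construction; the toy `codec` (uniform quantization) and the function names are assumptions for illustration only.

```python
import numpy as np

def codec(x, step=0.25):
    # Stand-in for a real, non-differentiable codec: uniform quantization.
    # In the paper's setting this would be an actual encode/decode call.
    return np.round(x / step) * step

def surrogate_grad_step(x, target, lr=0.1, step=0.25):
    """One gradient step on x minimizing ||codec(x) - target||^2.

    The loss gradient is computed from the *real* codec output (so it
    reflects the true compression error), while the codec's Jacobian is
    replaced by the identity (straight-through surrogate)."""
    y = codec(x, step)              # real, non-differentiable forward pass
    grad_y = 2.0 * (y - target)     # exact gradient w.r.t. the codec output
    grad_x = grad_y                 # surrogate: d codec / d x ~ identity
    loss = float(np.mean((y - target) ** 2))
    return x - lr * grad_x, loss
```

Iterating this update drives the post-codec signal toward the target even though the quantizer itself has zero gradient almost everywhere, which is the obstacle the surrogate is designed to bypass.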

📝 Abstract
The rapid growth in video consumption has introduced significant challenges to modern streaming architectures. Over-the-Top (OTT) video delivery now predominantly relies on Adaptive Bitrate (ABR) streaming, which dynamically adjusts bitrate and resolution based on client-side constraints such as display capabilities and network bandwidth. This pipeline typically involves downsampling the original high-resolution content, encoding and transmitting it, followed by decoding and upsampling on the client side. Traditionally, these processing stages have been optimized in isolation, leading to suboptimal end-to-end rate-distortion (R-D) performance. The advent of deep learning has spurred interest in jointly optimizing the ABR pipeline using learned resampling methods. However, training such systems end-to-end remains challenging due to the non-differentiable nature of standard video codecs, which obstructs gradient-based optimization. Recent works have addressed this issue using differentiable proxy models, based either on deep neural networks or hybrid coding schemes with differentiable components such as soft quantization, to approximate the codec behavior. While differentiable proxy codecs have enabled progress in compression-aware learning, they remain approximations that may not fully capture the behavior of standard, non-differentiable codecs. To our knowledge, there is no prior evidence demonstrating the inefficiencies of using standard codecs during training. In this work, we introduce a novel framework that enables end-to-end training with real, non-differentiable codecs by leveraging data-driven surrogate gradients derived from actual compression errors. It facilitates the alignment between training objectives and deployment performance. Experimental results show a 5.19% improvement in BD-BR (PSNR) compared to codec-agnostic training approaches, consistently across the entire rate-distortion convex hull spanning multiple downsampling ratios.
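The ABR chain the abstract describes (downsample, encode, transmit, decode, upsample) can be sketched end-to-end to make the joint R-D objective concrete. The average-pooling downsampler, quantizer codec, and nearest-neighbour upsampler below are toy stand-ins chosen for brevity, not the paper's learned components.

```python
import numpy as np

def downsample(x):
    # 2x average pooling: stand-in for a learned downsampler.
    return x.reshape(-1, 2).mean(axis=1)

def codec(x, step):
    # Uniform quantization: stand-in for a real, non-differentiable codec.
    return np.round(x / step) * step

def upsample(x):
    # Nearest-neighbour 2x upsampling: stand-in for a learned upsampler.
    return np.repeat(x, 2)

def end_to_end_mse(x, step=0.5):
    """Distortion of the full chain: downsample -> encode/decode -> upsample.

    This is the quantity that isolated per-stage optimization fails to
    minimize, motivating codec-aware end-to-end training."""
    return float(np.mean((upsample(codec(downsample(x), step)) - x) ** 2))
```

With a very fine quantization step the distortion reduces to the pure resampling error, while coarser steps add codec error on top; a codec-aware downsampler would shape its output to minimize the combined effect.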
Problem

Research questions and friction points this paper is trying to address.

Adaptive Bitrate Streaming · Rate-Distortion Optimization · Non-differentiable Codecs · End-to-End Training · Downsampling
Innovation

Methods, ideas, or system contributions that make the work stand out.

surrogate gradient · codec-aware learning · adaptive bitrate streaming · end-to-end optimization · non-differentiable codec
Esteban Pesnel
MediaKind, Rennes, France

Julien Le Tanou
MediaKind, Rennes, France

Michael Ropert
MediaKind, Rennes, France

Thomas Maugey
Senior Researcher at Inria
Image/video compression · Visual data representation · Graph Signal Processing · Digital sobriety

Aline Roumy
INRIA, Rennes, France