Semantic-Aware Adaptive Video Streaming Using Latent Diffusion Models for Wireless Networks

📅 2025-02-08
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address bandwidth waste, high storage overhead, and degraded Quality of Experience (QoE) in conventional CBR/ABR streaming over wireless networks, this paper proposes the first semantic-aware adaptive video streaming framework integrating Latent Diffusion Models (LDMs). We design a customized encoding pipeline within FFmpeg: LDM-based semantic compression is applied exclusively to I-frames, while B/P-frames are lightweighted into reconstruction metadata. Advanced denoising and Video Frame Interpolation (VFI) techniques are jointly incorporated to ensure semantic consistency and temporal coherence under noisy channel conditions. Experimental results on typical 5G channels demonstrate that our method achieves a 37% bandwidth reduction and 52% storage overhead reduction compared to state-of-the-art approaches, with significant improvements in PSNR and SSIM. End-to-end latency remains at the millisecond level, enabling joint optimization of rate-distortion performance and user-perceived quality.

Technology Category

Application Category

📝 Abstract
This paper proposes a novel framework for real-time adaptive-bitrate video streaming by integrating latent diffusion models (LDMs) within the FFmpeg techniques. This solution addresses the challenges of high bandwidth usage, storage inefficiencies, and quality of experience (QoE) degradation associated with traditional constant bitrate streaming (CBS) and adaptive bitrate streaming (ABS). The proposed approach leverages LDMs to compress I-frames into a latent space, offering significant storage and semantic transmission savings without sacrificing high visual quality. While it keeps B-frames and P-frames as adjustment metadata to ensure efficient video reconstruction at the user side, the proposed framework is complemented with the most state-of-the-art denoising and video frame interpolation (VFI) techniques. These techniques mitigate semantic ambiguity and restore temporal coherence between frames, even in noisy wireless communication environments. Experimental results demonstrate the proposed method achieves high-quality video streaming with optimized bandwidth usage, outperforming state-of-the-art solutions in terms of QoE and resource efficiency. This work opens new possibilities for scalable real-time video streaming in 5G and future post-5G networks.
Problem

Research questions and friction points this paper is trying to address.

Real-time adaptive-bitrate video streaming
High bandwidth usage and storage inefficiencies
Quality of experience degradation in wireless networks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Latent Diffusion Models integration
Compress I-frames efficiently
Denoising and interpolation enhancement
🔎 Similar Papers