GenAI-enabled Residual Motion Estimation for Energy-Efficient Semantic Video Communication

📅 2025-12-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address high latency, high bitrate, and high power consumption in semantic video communication, this paper proposes PENME, a predictability-aware and entropy-adaptive neural motion estimation framework. Its core contributions are: (1) a five-step policy that selects, per frame, a residual motion extraction model (convolutional neural network, vision transformer, or optical flow); (2) selective diffusion-based refinement with a Latent Consistency Model (LCM-4), triggered only for frames with low predictability or large residuals; and (3) a radio resource block allocation strategy driven jointly by channel state and residual motion. In simulations on the Vimeo90K dataset, the framework achieves 40% lower latency, 90% less transmitted data, and 35% higher throughput than the baseline methods. Reconstruction quality also improves: PSNR rises by about 40%, MS-SSIM by roughly 19%, and LPIPS falls by nearly 35%.

📝 Abstract
Semantic communication addresses the limitations of the Shannon paradigm by focusing on transmitting meaning rather than exact representations, thereby reducing unnecessary resource consumption. This is particularly beneficial for video, which dominates network traffic and demands high bandwidth and power, making semantic approaches ideal for conserving resources while maintaining quality. In this paper, we propose a Predictability-aware and Entropy-adaptive Neural Motion Estimation (PENME) method to address challenges related to high latency, high bitrate, and power consumption in video transmission. PENME makes per-frame decisions to select a residual motion extraction model, convolutional neural network, vision transformer, or optical flow, using a five-step policy based on motion strength, global motion consistency, peak sharpness, heterogeneity, and residual error. The residual motions are then transmitted to the receiver, where the frames are reconstructed via motion-compensated updates. Next, a selective diffusion-based refinement, the Latent Consistency Model (LCM-4), is applied on frames that trigger refinement due to low predictability or large residuals, while predictable frames skip refinement. PENME also allocates radio resource blocks with awareness of residual motion and channel state, reducing power consumption and bandwidth usage while maintaining high semantic similarity. Our simulation results on the Vimeo90K dataset demonstrate that the proposed PENME method handles various types of video, outperforming traditional communication, hybrid, and adaptive bitrate semantic communication techniques, achieving 40% lower latency, 90% less transmitted data, and 35% higher throughput. For semantic communication metrics, PENME improves PSNR by about 40%, increases MS-SSIM by roughly 19%, and reduces LPIPS by nearly 35%, compared with the baseline methods.
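The per-frame model selection described in the abstract can be pictured as a short decision cascade over the five listed motion statistics. The following is only a minimal sketch under stated assumptions: the `FrameStats` type, the `select_model` function, the ordering of the checks, the mapping from each check to a model, and every threshold value are hypothetical illustrations and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class FrameStats:
    """Hypothetical per-frame statistics, each assumed normalized to [0, 1]."""
    motion_strength: float     # average motion magnitude
    global_consistency: float  # 1.0 means all regions move the same way
    peak_sharpness: float      # how dominant the main motion peak is
    heterogeneity: float       # spread of local motion across regions
    residual_error: float      # error left after motion compensation


def select_model(s: FrameStats) -> str:
    """Pick a residual motion extractor with a five-step cascade.

    All thresholds and the step-to-model mapping are illustrative assumptions.
    """
    # Step 1: almost no motion, so the cheapest estimator is assumed sufficient.
    if s.motion_strength < 0.05:
        return "optical_flow"
    # Step 2: coherent global motion (for example a camera pan).
    if s.global_consistency > 0.8:
        return "optical_flow"
    # Step 3: one sharp dominant motion peak, assumed to suit a CNN.
    if s.peak_sharpness > 0.6:
        return "cnn"
    # Step 4: many differently moving regions, assumed to need a transformer.
    if s.heterogeneity > 0.5:
        return "vit"
    # Step 5: fall back on the residual error as the tie-breaker.
    return "vit" if s.residual_error > 0.1 else "cnn"
```

A cascade of this kind keeps the decision cost negligible next to the cost of running any of the three extractors, which is one plausible reason to prefer simple statistics over a learned selector.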
Problem

Research questions and friction points this paper is trying to address.

Reducing latency, bitrate, and power in video transmission
Selecting optimal motion extraction models per frame adaptively
Allocating radio resources efficiently to conserve bandwidth and power
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive neural motion estimation for video transmission
Selective diffusion refinement based on predictability
Radio resource allocation aware of motion and channel
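The last two ideas above, refinement gated on predictability and resource allocation that follows both the residual and the channel, can be sketched in a few lines. This is an illustrative sketch only: the function names, the threshold defaults, and the simple "bits divided by per-block capacity" sizing rule are assumptions made for the example, not the scheme used in the paper.

```python
import math


def needs_refinement(predictability: float, residual_energy: float,
                     pred_min: float = 0.5, resid_max: float = 0.2) -> bool:
    """Return True when a frame should go through diffusion-based refinement.

    A frame is refined if it is hard to predict or leaves a large residual;
    predictable frames skip refinement. Thresholds are hypothetical.
    """
    return predictability < pred_min or residual_energy > resid_max


def resource_blocks_needed(residual_bits: int, bits_per_rb: float,
                           max_rbs: int) -> int:
    """Size the resource block grant from residual size and channel state.

    `bits_per_rb` stands in for the channel state: a better channel carries
    more bits per resource block, so fewer blocks are needed. The grant is
    capped at `max_rbs`, the assumed budget available to this user.
    """
    if bits_per_rb <= 0:
        raise ValueError("bits_per_rb must be positive")
    return min(max_rbs, math.ceil(residual_bits / bits_per_rb))
```

For example, a 1000-bit residual over a channel that carries 300 bits per block would request 4 blocks, while the same residual over a channel that carries only 50 bits per block would be capped at a 10-block budget.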
Authors
Shavbo Salehi
School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, Canada
Pedro Enrique Iturria-Rivera
Ericsson Canada Inc., Ottawa, Canada
Medhat Elsayed
Ottawa University, PhD, SMIEEE
AI-enabled wireless networks, Adversarial ML, Intent-driven networks, LLMs/GenAI, 6G
Majid Bavand
Ericsson Canada Inc., Ottawa, Canada
Yigit Ozcan
Ericsson Canada Inc., Ottawa, Canada
Melike Erol-Kantarci
Canada Research Chair & Professor, University of Ottawa and Sr. Product Manager for AI RAN, Ericsson
AI-enabled wireless networks, AI, GenAI, 5G, 6G, O-RAN, smart grid, AI\GenAI, 5G\6G\O-RAN