Real-Time Video Prediction With Fast Video Interpolation Model and Prediction Training

📅 2024-10-27
🏛️ International Conference on Information Photonics
📈 Citations: 1
Influential: 0
🤖 AI Summary
To mitigate the effect of transmission latency on interactive user experience, this paper proposes IFRVP (Intermediate Feature Refinement Video Prediction), a real-time video prediction framework aimed at zero-latency interaction over networks. It adapts a simple convolution-only frame interpolation network based on IFRNet to video prediction through three training methods, and adds ELAN-based residual blocks to improve both inference speed and accuracy. According to the authors' evaluations, the proposed models achieve the best trade-off between prediction accuracy and computational speed among existing video prediction methods. A demonstration movie is available at http://bit.ly/IFRVPDemo.

📝 Abstract
Transmission latency significantly affects users' quality of experience in real-time interaction and actuation. As latency is principally inevitable, video prediction can be utilized to mitigate the latency and ultimately enable zero-latency transmission. However, most of the existing video prediction methods are computationally expensive and impractical for real-time applications. In this work, we therefore propose real-time video prediction towards the zero-latency interaction over networks, called IFRVP (Intermediate Feature Refinement Video Prediction). Firstly, we propose three training methods for video prediction that extend frame interpolation models, where we utilize a simple convolution-only frame interpolation network based on IFRNet. Secondly, we introduce ELAN-based residual blocks into the prediction models to improve both inference speed and accuracy. Our evaluations show that our proposed models perform efficiently and achieve the best trade-off between prediction accuracy and computational speed among the existing video prediction methods. A demonstration movie is also provided at http://bit.ly/IFRVPDemo.

Problem

Research questions and friction points this paper is trying to address.

Mitigate the effect of transmission latency on real-time video interaction by predicting future frames
Make video prediction fast and computationally cheap enough for real-time use
Improve the trade-off between prediction accuracy and computational speed
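
To make the latency problem concrete, the sketch below works out how many frames ahead a predictor would have to look to hide a given transmission latency, and whether a model is fast enough to keep up with the stream. The function names, the latency values, and the frame rate are illustrative assumptions, not figures or code from the paper.

```python
import math


def frames_to_predict(latency_ms: float, fps: float) -> int:
    """Return how many future frames cover `latency_ms` at `fps` frames per second.

    Hypothetical helper: the paper does not define this function.
    """
    if latency_ms < 0 or fps <= 0:
        raise ValueError("latency_ms must be >= 0 and fps must be > 0")
    # Each frame spans 1000 / fps milliseconds; round up to whole frames.
    return math.ceil(latency_ms * fps / 1000.0)


def meets_real_time(inference_ms: float, fps: float) -> bool:
    """Return True if one prediction finishes within a single frame interval."""
    if fps <= 0:
        raise ValueError("fps must be > 0")
    return inference_ms <= 1000.0 / fps


if __name__ == "__main__":
    # Assumed example values: 100 ms latency on a 30 fps stream.
    print(frames_to_predict(100, 30))
    print(meets_real_time(20, 30))
```

Under these assumed values, a 100 ms delay on a 30 fps stream corresponds to predicting three frames ahead, and the predictor has to finish each frame within roughly 33 ms to keep pace, which is why computational cost matters as much as accuracy here.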

Innovation

Methods, ideas, or system contributions that make the work stand out.

Simple convolution-only frame interpolation network based on IFRNet
ELAN-based residual blocks that improve both inference speed and accuracy
Three training methods that extend frame interpolation models to video prediction