Promptus: Can Prompts Streaming Replace Video Streaming with Stable Diffusion

📅 2024-05-30

🏛️ arXiv.org

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

To address surging video traffic, which is pushing conventional streaming toward its limits in compression efficiency and communication capacity, this paper proposes Promptus, a semantic communication system that transmits compact prompts instead of encoded video content and uses Stable Diffusion to regenerate the video at the receiver. Three techniques support this prompt-streaming paradigm: gradient descent-based prompt fitting, which keeps the generated frames pixel-aligned with the original video; low-rank decomposition-based bitrate control, which provides adaptive bitrate; and interpolation-aware fitting, which handles inter-frame compression. Across various video genres, Promptus achieves more than a 4× bandwidth reduction compared to H.265 at the same perceptual quality. At extremely low bitrates, it improves perceptual quality (LPIPS) by 0.139 over VAE and 0.118 over H.265, and reduces the ratio of severely distorted frames by 89.3% and 91.7%.
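The summary names interpolation as the basis for inter-frame compression but does not spell out the mechanism. As a rough illustration only, the sketch below assumes the sender transmits a prompt vector for every `gap`-th frame and the receiver fills in the frames between by linear interpolation. The names `lerp`, `expand_keyframes`, and `gap`, and the choice of linear blending, are hypothetical and not taken from the paper.

```python
from typing import List

Vector = List[float]


def lerp(a: Vector, b: Vector, t: float) -> Vector:
    """Linearly blend two equal-length vectors; t=0 gives a, t=1 gives b."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def expand_keyframes(keys: Vector and List[Vector], gap: int) -> List[Vector]:
    """Rebuild a per-frame prompt sequence from keyframe prompts.

    Assumption for this sketch: one keyframe prompt is sent every `gap`
    frames, and in-between prompts are linear blends of their neighbours.
    """
    if gap < 1:
        raise ValueError("gap must be >= 1")
    if not keys:
        return []
    frames: List[Vector] = []
    for left, right in zip(keys, keys[1:]):
        for step in range(gap):
            frames.append(lerp(left, right, step / gap))
    frames.append(list(keys[-1]))  # the final keyframe is sent as-is
    return frames


if __name__ == "__main__":
    # Two 2-dimensional keyframe prompts, four frames apart.
    for prompt in expand_keyframes([[0.0, 0.0], [4.0, 8.0]], gap=4):
        print(prompt)
```

Under this assumption only the keyframe prompts cross the network, so the number of transmitted prompts shrinks by roughly a factor of `gap`. The term "interpolation-aware" suggests that the fitting step accounts for how well the interpolated prompts reproduce the skipped frames, which this sketch does not model.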

📝 Abstract

With the exponential growth of video traffic, traditional video streaming systems are approaching their limits in compression efficiency and communication capacity. To further reduce bitrate while maintaining quality, we propose Promptus, a disruptive semantic communication system that streams prompts instead of video content: it represents real-world video frames as a series of "prompts" for delivery and employs Stable Diffusion to generate the video at the receiver. To ensure that the generated video is pixel-aligned with the original video, a gradient descent-based prompt fitting framework is proposed. Further, a low-rank decomposition-based bitrate control algorithm is introduced to achieve adaptive bitrate. For inter-frame compression, an interpolation-aware fitting algorithm is proposed. Evaluations across various video genres demonstrate that, compared to H.265, Promptus can achieve more than a 4x bandwidth reduction while preserving the same perceptual quality. At extremely low bitrates, Promptus can enhance the perceptual quality by 0.139 and 0.118 (in LPIPS) compared to VAE and H.265, respectively, and decrease the ratio of severely distorted frames by 89.3% and 91.7%. Our work opens up a new paradigm for efficient video communication. Promptus is open-sourced at: https://github.com/JiangkaiWu/Promptus.
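To see why a low-rank decomposition can act as a bitrate knob, consider the arithmetic. A matrix of shape rows × cols holds rows · cols values, whereas a rank-r factorization into a rows × r factor and an r × cols factor holds only r · (rows + cols) values, so lowering r lowers the payload. The sketch below works through that count. The 77 × 1024 shape is an illustrative assumption, not a figure from the abstract, and the function names are hypothetical.

```python
def full_size(rows: int, cols: int) -> int:
    """Number of values in an unfactored rows x cols matrix."""
    return rows * cols


def low_rank_size(rows: int, cols: int, rank: int) -> int:
    """Number of values in a rank-`rank` factorization (rows x rank, rank x cols)."""
    return rank * (rows + cols)


def max_rank_for_budget(rows: int, cols: int, budget_values: int) -> int:
    """Largest rank whose two factors fit within `budget_values` values.

    Capped at min(rows, cols), the highest meaningful rank; returns 0 when
    even a rank-1 factorization does not fit.
    """
    return min(budget_values // (rows + cols), min(rows, cols))


if __name__ == "__main__":
    rows, cols = 77, 1024  # assumed shape, for illustration only
    print(full_size(rows, cols))
    print(low_rank_size(rows, cols, 8))
    print(max_rank_for_budget(rows, cols, 10_000))
```

With these assumed dimensions, the unfactored matrix holds 78,848 values and a rank-8 factorization holds 8,808, roughly a ninefold reduction before any quantization or entropy coding. A sender could pick the rank per frame or per segment to track the available bandwidth, which is one plausible reading of "adaptive bitrate" here.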

Problem

Research questions and friction points this paper is trying to address.

Reducing video bandwidth using prompt-based streaming

Ensuring pixel alignment via gradient descent-based prompt fitting

Enhancing quality at low bitrates via Stable Diffusion

Innovation

Methods, ideas, or system contributions that make the work stand out.

Streaming prompts instead of video content

Gradient descent-based prompt fitting framework

Low-rank decomposition-based bitrate control algorithm
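The prompt fitting idea can be illustrated with a toy problem: treat the generator as a differentiable function of the prompt, measure the pixel error against the target frame, and move the prompt down the gradient of that error. In the minimal sketch below, a fixed linear map stands in for Stable Diffusion so that the gradient can be written by hand. The stand-in generator, the mean-squared-error loss, and all names (`generate`, `fit_prompt`, `weights`) are assumptions for illustration and not the authors' implementation.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def generate(weights: Matrix, prompt: Vector) -> Vector:
    """Toy stand-in generator: each pixel is a weighted sum of prompt entries."""
    return [sum(w * p for w, p in zip(row, prompt)) for row in weights]


def mse(a: Vector, b: Vector) -> float:
    """Mean squared error between two equal-length pixel vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def fit_prompt(weights: Matrix, target: Vector,
               steps: int = 500, lr: float = 0.1) -> Vector:
    """Fit a prompt by plain gradient descent on the pixel-wise MSE."""
    n = len(target)
    prompt = [0.0] * len(weights[0])  # start from an all-zero prompt
    for _ in range(steps):
        residual = [g - t for g, t in zip(generate(weights, prompt), target)]
        # d(MSE)/d(prompt[j]) = (2 / n) * sum_i weights[i][j] * residual[i]
        grad = [2.0 / n * sum(weights[i][j] * residual[i] for i in range(n))
                for j in range(len(prompt))]
        prompt = [p - lr * g for p, g in zip(prompt, grad)]
    return prompt


if __name__ == "__main__":
    weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 "pixels", 2 prompt entries
    target = generate(weights, [0.5, -0.25])        # frame made by a known prompt
    fitted = fit_prompt(weights, target)
    print(fitted, mse(generate(weights, fitted), target))
```

In a real system the hand-written gradient would be replaced by automatic differentiation through the generative model, and the loss might combine pixel and perceptual terms; the loop structure (generate, compare, update the prompt) is the part this sketch is meant to convey.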

🔎 Similar Papers

Pyramidal Flow Matching for Efficient Video Generative Modeling