TransVDM: Motion-Constrained Video Diffusion Model for Transparent Video Synthesis

📅 2025-02-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing video diffusion models (VDMs) lack explicit modeling of transparency (alpha channels), limiting the quality of transparent video generation. To address this, the paper proposes TransVDM, the first transparent video diffusion framework. It comprises (1) a Transparent Variational Autoencoder (TVAE) that disentangles alpha and RGB representations, and (2) an Alpha Motion Constraint Module (AMCM) that enforces temporal consistency in transparent regions via motion-guided regularization; the authors also contribute (3) a large-scale transparent video dataset of 250K frames. Integrated with a pretrained UNet-based diffusion backbone, the framework improves alpha prediction accuracy and structural-motion coherence, suppressing artifacts and alpha drift in transparent regions, and outperforms prior methods across multiple benchmarks.
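The RGB/alpha disentanglement that the TVAE performs can be pictured with a minimal sketch. This is a conceptual illustration, not the authors' network: the `(T, H, W, 4)` frame layout and the function name are assumptions for illustration only.

```python
import numpy as np

def split_rgba(frames):
    """Split an RGBA video clip into separate RGB and alpha streams.

    frames: (T, H, W, 4) array with values in [0, 1]. In TransVDM a
    learned TVAE encodes these streams into the VDM latent space; here
    we illustrate only the channel disentanglement itself.
    """
    rgb = frames[..., :3]    # (T, H, W, 3) color content
    alpha = frames[..., 3:]  # (T, H, W, 1) transparency
    return rgb, alpha

# Toy 4-frame RGBA clip.
clip = np.random.rand(4, 8, 8, 4)
rgb, alpha = split_rgba(clip)
assert rgb.shape == (4, 8, 8, 3)
assert alpha.shape == (4, 8, 8, 1)
```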

📝 Abstract
Recent developments in Video Diffusion Models (VDMs) have demonstrated remarkable capability to generate high-quality video content. Nonetheless, the potential of VDMs for creating transparent videos remains largely uncharted. In this paper, we introduce TransVDM, the first diffusion-based model specifically designed for transparent video generation. TransVDM integrates a Transparent Variational Autoencoder (TVAE) and a pretrained UNet-based VDM, along with a novel Alpha Motion Constraint Module (AMCM). The TVAE captures the alpha channel transparency of video frames and encodes it into the latent space of the VDMs, facilitating a seamless transition to transparent video diffusion models. To improve the detection of transparent areas, the AMCM integrates motion constraints from the foreground within the VDM, helping to reduce undesirable artifacts. Moreover, we curate a dataset containing 250K transparent frames for training. Experimental results demonstrate the effectiveness of our approach across various benchmarks.
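The idea of constraining foreground motion in the alpha channel can be sketched as a simple temporal-consistency regularizer. This is a stand-in for the AMCM, whose exact formulation the listing does not give; the frame-difference loss and the optional foreground weighting below are assumptions for illustration.

```python
import numpy as np

def alpha_motion_penalty(alpha, fg_mask=None):
    """Toy temporal-consistency penalty on an alpha sequence.

    alpha: (T, H, W) transparency maps in [0, 1].
    fg_mask: optional (T-1, H, W) weights emphasizing foreground
    regions, standing in for AMCM's motion-derived guidance.
    Returns the mean squared difference between consecutive frames.
    """
    diff = (alpha[1:] - alpha[:-1]) ** 2
    if fg_mask is not None:
        diff = diff * fg_mask
    return float(diff.mean())

static = np.ones((4, 8, 8))  # unchanging alpha -> zero penalty
moving = np.zeros((4, 8, 8))
moving[2:] = 1.0             # abrupt alpha flip -> positive penalty
assert alpha_motion_penalty(static) == 0.0
assert alpha_motion_penalty(moving) > 0.0
```

A real implementation would compare each alpha frame against its predecessor warped by optical flow rather than the raw previous frame, so that legitimate foreground motion is not penalized.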
Problem

Research questions and friction points this paper is trying to address.

Transparent video generation remains largely unexplored by video diffusion models
Standard VDMs lack explicit alpha-channel modeling, causing artifacts in transparent regions
No large-scale transparent video dataset was available for training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transparent Variational Autoencoder (TVAE) encoding alpha transparency into the VDM latent space
Alpha Motion Constraint Module (AMCM) applying foreground motion constraints to reduce artifacts
Seamless integration with a pretrained UNet-based VDM
A curated training dataset of 250K transparent frames