🤖 AI Summary
Existing video layer decomposition models suffer from poor generalization and slow convergence, as they require training implicit neural representations (INRs) independently for each video. To address this, we propose the first hypernetwork-based meta-learning framework for video decomposition. Our method employs a video encoder to extract global semantic embeddings, which condition a hypernetwork to generate compact, video-specific INR parameters—enabling cross-video knowledge transfer and mitigating overfitting to individual videos. Evaluated on multiple benchmarks, our approach accelerates training by 3–5× while maintaining or improving layer separation quality, and supports real-time, editing-grade inference. The core contribution is the first integration of hypernetworks into video layer decomposition, establishing an efficient and generalizable meta-learning paradigm for this task.
📝 Abstract
Decomposing a video into a layer-based representation is crucial for video editing in the creative industries, as it enables specific layers to be edited independently. Existing video-layer decomposition models rely on implicit neural representations (INRs) trained independently for each video, making the process time-consuming when applied to new videos. Noticing this limitation, we propose a meta-learning strategy that learns a generic video decomposition model to speed up training on new videos. Our model is based on a hypernetwork architecture which, given a video-encoder embedding, generates the parameters for a compact INR-based neural video decomposition model. Our strategy mitigates single-video overfitting and, importantly, shortens convergence time for video decomposition on new, unseen videos. Our code is available at: https://hypernvd.github.io/
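To make the core mechanism concrete, here is a minimal NumPy sketch of the hypernetwork idea: a video-level embedding is mapped to a flat parameter vector, which is then reshaped into the weights of a small coordinate-based INR. All sizes, names, and the single-linear-layer hypernetwork are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

EMB_DIM = 8              # video-encoder embedding size (assumed, for illustration)
HIDDEN = 16              # hidden width of the compact INR (assumed)
IN_DIM, OUT_DIM = 3, 4   # INR maps (x, y, t) -> e.g. RGB + a layer alpha (assumed)

# Shapes of the INR parameters the hypernetwork must generate.
shapes = [(IN_DIM, HIDDEN), (HIDDEN,), (HIDDEN, OUT_DIM), (OUT_DIM,)]
n_params = sum(int(np.prod(s)) for s in shapes)

# Hypernetwork: here just one linear map from embedding to all INR parameters.
W_hyper = rng.normal(0.0, 0.1, (EMB_DIM, n_params))

def generate_inr_params(embedding):
    """Split the hypernetwork output into the INR's weight/bias tensors."""
    flat = embedding @ W_hyper
    params, i = [], 0
    for s in shapes:
        n = int(np.prod(s))
        params.append(flat[i:i + n].reshape(s))
        i += n
    return params

def inr_forward(params, coords):
    """Compact INR: coords of shape (N, 3) -> per-point outputs of shape (N, 4)."""
    W1, b1, W2, b2 = params
    h = np.sin(coords @ W1 + b1)   # sinusoidal activation, SIREN-style (assumed)
    return h @ W2 + b2

embedding = rng.normal(size=EMB_DIM)        # stand-in for a video embedding
params = generate_inr_params(embedding)
coords = rng.uniform(-1.0, 1.0, (5, IN_DIM))  # five (x, y, t) query points
out = inr_forward(params, coords)
print(out.shape)  # (5, 4)
```

The point of the design is that only `W_hyper` (and the video encoder) are shared and meta-learned across videos, while the per-video INR parameters are produced in a single forward pass, which is what allows fast adaptation to unseen videos.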