🤖 AI Summary
Existing video processing methods exhibit limited performance in slow-motion synthesis, super-resolution, denoising, and inpainting. This paper introduces ActINR, a framework built on the observation that implicit neural representations (INRs) are well suited to inter-frame motion modeling, specifically by treating INR bias parameters as learnable temporal priors. ActINR views the INR as a learnable dictionary and employs a time-conditioned MLP to generate temporally adaptive bias terms, enabling joint end-to-end optimization with weights shared across frames. The framework unifies multiple video restoration tasks, including 10× slow-motion generation, 4× super-resolution coupled with 2× slow motion, video denoising, and inpainting. Evaluated on standard benchmarks, ActINR achieves PSNR gains over prior methods often exceeding 6 dB. This work advances both the theoretical understanding of INRs for dynamic video modeling and their practical effectiveness, establishing a unified, parameter-efficient paradigm for spatiotemporal video representation learning.
📝 Abstract
We propose a new continuous video modeling framework based on implicit neural representations (INRs) called ActINR. At the core of our approach is the observation that INRs can be considered as a learnable dictionary, with the shapes of the basis functions governed by the weights of the INR, and their locations governed by the biases. Given compact non-linear activation functions, we hypothesize that an INR's biases are suitable to capture motion across images, and facilitate compact representations for video sequences. Using these observations, we design ActINR to share INR weights across frames of a video sequence, while using unique biases for each frame. We further model the biases as the output of a separate INR conditioned on time index to promote smoothness. By training the video INR and this bias INR together, we demonstrate unique capabilities, including $10\times$ video slow motion, $4\times$ spatial super resolution along with $2\times$ slow motion, denoising, and video inpainting. ActINR performs remarkably well across numerous video processing tasks (often achieving more than 6 dB improvement), setting a new standard for continuous modeling of videos.
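The weight-sharing idea in the abstract can be illustrated with a minimal sketch: one set of INR weights is shared across all frames, while a small "bias INR" maps a scalar time index to the hidden-layer biases, so only the bias vector changes from frame to frame. All layer sizes, the sinusoidal activations, and the tiny bias network below are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared video-INR weights (fixed across all frames): basis-function
# *shapes* come from the weights, their *locations* from the biases.
D_IN, D_HID, D_OUT = 2, 16, 3            # (x, y) coordinate -> RGB
W1 = rng.normal(0, 1 / np.sqrt(D_IN), (D_IN, D_HID))
W2 = rng.normal(0, 1 / np.sqrt(D_HID), (D_HID, D_OUT))

# Hypothetical "bias INR": a tiny MLP mapping time t to the hidden-layer
# biases, so the biases vary smoothly in time while the weights are shared.
Bw1 = rng.normal(0, 1.0, (1, 8))
Bw2 = rng.normal(0, 1 / np.sqrt(8), (8, D_HID))

def bias_inr(t):
    """Generate the frame-specific bias vector from a scalar time index."""
    h = np.sin(Bw1 * t)                  # sinusoidal features of time
    return h @ Bw2                       # shape (1, D_HID)

def act_inr(coords, t):
    """Evaluate the video INR at spatial coords for frame time t."""
    b = bias_inr(t)                      # time-dependent biases
    h = np.sin(coords @ W1 + b)          # compact (sinusoidal) activations
    return h @ W2                        # RGB output per coordinate

# Query a 4x4 grid of coordinates at two different times: same weights,
# different biases -> the representation shifts, modeling motion.
coords = np.stack(np.meshgrid(np.linspace(-1, 1, 4),
                              np.linspace(-1, 1, 4)), -1).reshape(-1, 2)
out_t0 = act_inr(coords, 0.0)
out_t1 = act_inr(coords, 0.5)
print(out_t0.shape)                      # (16, 3)
```

Because time enters only through the bias network, frames at arbitrary intermediate times (e.g. for slow motion) can be queried by evaluating `act_inr` at fractional `t`.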