AI Summary
This work addresses the long-overlooked problem of modeling dynamic appearance evolution that is not driven by motion, such as rusting, weathering, and melting. We propose a unified framework based on Neural Ordinary Differential Equations (Neural ODEs). Methodologically, we pioneer the coupling of noise initialization with time-evolving visual statistical features within a single Neural ODE, enabling end-to-end denoising and dynamic synthesis. We introduce a two-stage temporal training strategy (warm-up followed by generation) and construct the first synchronized dual-modality dynamic material video dataset, comprising 22 RGB sequences and 21 flash-illuminated BRDF sequences. Experiments demonstrate that our method generates more photorealistic and temporally coherent dynamic textures over long time spans, and a user study confirms a statistically significant preference for our results over existing approaches. Both code and dataset are publicly released.
Abstract
We propose a method to reproduce dynamic appearance textures with space-stationary but time-varying visual statistics. While most previous work decomposes dynamic textures into static appearance and motion, we focus on dynamic appearance that results not from motion but from variations of fundamental properties, such as rusting, decaying, melting, and weathering. To this end, we adopt a neural ordinary differential equation (ODE) to learn the underlying dynamics of appearance from a target exemplar. We simulate the ODE in two phases. In the "warm-up" phase, the ODE diffuses random noise to an initial state. In the generation phase, we then constrain the further evolution of this ODE to replicate the evolution of visual feature statistics in the exemplar. The key innovation of this work is a single neural ODE that achieves both denoising and evolution for dynamics synthesis, trained with a proposed temporal training scheme. We study both relightable (BRDF) and non-relightable (RGB) appearance models. For both we introduce new pilot datasets, allowing such phenomena to be studied for the first time: for RGB we provide 22 dynamic textures acquired from free online sources; for BRDFs, we further acquire a dataset of 21 flash-lit videos of time-varying materials, enabled by a simple-to-construct setup. Our experiments show that our method consistently yields realistic and coherent results, whereas prior works falter under pronounced temporal appearance variations. A user study confirms our approach is preferred to previous work for such exemplars.
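The two-phase simulation described above can be sketched in code. The following is a minimal NumPy illustration, not the paper's implementation: a hypothetical per-pixel feature field is integrated with explicit Euler steps, first from noise to an initial state (warm-up), then further in time (generation), where a training loss would compare the evolved feature statistics against the exemplar's statistics at each time step. All network sizes, step counts, and the choice of mean/covariance as the matched statistics are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny ODE field f(x, t): a one-hidden-layer MLP over
# per-pixel features, with scalar time t appended as an extra input.
W1 = rng.normal(0.0, 0.1, (9, 16))   # 8 feature channels + time -> hidden
W2 = rng.normal(0.0, 0.1, (16, 8))   # hidden -> feature velocity

def ode_field(x, t):
    """dx/dt at state x (N pixels x 8 channels) and time t."""
    inp = np.concatenate([x, np.full((x.shape[0], 1), t)], axis=1)
    return np.tanh(inp @ W1) @ W2

def integrate(x, t0, t1, steps=50):
    """Explicit Euler integration of the neural ODE from t0 to t1."""
    dt = (t1 - t0) / steps
    t = t0
    for _ in range(steps):
        x = x + dt * ode_field(x, t)
        t += dt
    return x

def feature_stats(x):
    """Illustrative visual statistics: channel mean and covariance."""
    return x.mean(axis=0), np.cov(x, rowvar=False)

# Warm-up phase: diffuse random noise to the initial texture state.
noise = rng.normal(size=(256, 8))        # 256 "pixels", 8 channels
x0 = integrate(noise, t0=-1.0, t1=0.0)

# Generation phase: further evolution of the same ODE; a training
# loss would match (mu, cov) to the exemplar's statistics at t=1
# and backpropagate through the solver.
x1 = integrate(x0, t0=0.0, t1=1.0)
mu, cov = feature_stats(x1)
print(mu.shape, cov.shape)
```

In the actual method a learned network replaces the random-weight MLP and the matched statistics come from deep features of the exemplar video, but the two-phase structure (denoise, then evolve under a statistics constraint) is the same.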