🤖 AI Summary
This work addresses the problem of animating static images, i.e., reconstructing dynamic video from still frames. We propose an unsupervised animation generation method based on image patch reconstruction. Methodologically, we first apply k-means clustering over local image patches to model structural priors; cross-patch matching and stochastic sampling then enable semantics-preserving, block-level recomposition, which facilitates cross-domain animation generation where the source and target domains differ in high-level semantics but share low-level structural consistency. Our key contribution is abandoning pixel-level copying in favor of local-structure-driven creative reconstruction, which substantially improves the diversity and naturalness of the generated animations. Experiments demonstrate that our approach generates high-quality, visually coherent, and physically plausible animations without video supervision or optical-flow estimation, and that it outperforms existing static-to-animation baselines in both qualitative and quantitative evaluations.
📝 Abstract
We present a patch-based image reconstruction and animation method that uses existing image data to bring still images to life through motion. Image patches from curated datasets are grouped using k-means clustering, and a target image is reconstructed by matching its patches to these clusters and randomly sampling replacements from them. This approach emphasizes reinterpretation over replication, allowing the source and target domains to differ conceptually while sharing local structures.
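The core pipeline described above (patch extraction, k-means clustering of source patches, and cluster-matched random resampling to recompose a target) can be illustrated with a minimal NumPy sketch. The patch size, cluster count, use of non-overlapping grayscale patches, and all function names here are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def extract_patches(img, p):
    """Split an (H, W) grayscale image into flattened, non-overlapping p x p patches."""
    H, W = img.shape
    return np.array([img[i:i + p, j:j + p].ravel()
                     for i in range(0, H - p + 1, p)
                     for j in range(0, W - p + 1, p)])

def kmeans(X, k, iters=20, seed=0):
    """Minimal k-means over patch vectors; returns cluster centers and labels."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if (labels == c).any():  # skip empty clusters
                centers[c] = X[labels == c].mean(axis=0)
    return centers, labels

def reconstruct(target, source_patches, centers, labels, p, rng):
    """Rebuild the target patch-by-patch: match each target patch to its
    nearest cluster center, then sample a random source patch from that cluster."""
    H, W = target.shape
    out = np.zeros((H, W))
    for i in range(0, H - p + 1, p):
        for j in range(0, W - p + 1, p):
            q = target[i:i + p, j:j + p].ravel()
            c = np.linalg.norm(centers - q, axis=1).argmin()
            pool = source_patches[labels == c]
            out[i:i + p, j:j + p] = pool[rng.integers(len(pool))].reshape(p, p)
    return out
```

Because every output block is drawn verbatim from the source patch pool rather than copied from the target, repeated runs with different random seeds yield different yet locally consistent recompositions, which is the property the animation relies on.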