🤖 AI Summary
This paper addresses the challenging problem of generating physically plausible 4D dynamic object sequences from only two 2D RGB images depicting the initial and final states. We propose a two-stage decoupled framework that requires neither 3D templates nor category-specific priors and accepts in-the-wild images as input. First, we leverage a pretrained generative image-to-3D reconstruction model to recover geometry and texture for both end states. Second, a differentiable, physics-driven deformation module evolves intermediate frames via latent-space interpolation, ensuring motion plausibility and strict geometric and textural consistency. The method achieves high-fidelity, unsupervised 4D sequence generation, significantly reducing reliance on large-scale annotated 4D datasets and heavy computational resources. To our knowledge, it is the first approach to enable end-to-end synthesis of spatiotemporally coherent 4D dynamic content from just two input RGB images.
📝 Abstract
Despite the astonishing progress in generative AI, 4D dynamic object generation remains an open challenge. With limited high-quality training data and heavy compute requirements, hallucinating unseen geometry together with unseen movement poses great difficulties for generative models. In this work, we propose TwoSquared, a method that obtains a physically plausible 4D sequence starting from only two 2D RGB images corresponding to the beginning and end of an action. Instead of directly solving the 4D generation problem, TwoSquared decomposes it into two steps: 1) an image-to-3D generation module built on an existing generative model trained on high-quality 3D assets, and 2) a physically inspired deformation module that predicts intermediate movements. As a result, our method requires neither templates nor object-class-specific prior knowledge and can take in-the-wild images as input. In our experiments, we demonstrate that TwoSquared produces texture-consistent and geometry-consistent 4D sequences given only 2D images.
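The two-stage decomposition can be sketched at a high level. The snippet below is a minimal illustration, not the paper's implementation: `reconstruct_3d` is a hypothetical stand-in for the pretrained image-to-3D model (here it just maps an image to a deterministic latent code), and the linear latent interpolation stands in for the physics-inspired deformation module that predicts intermediate movements.

```python
import zlib
import numpy as np

def reconstruct_3d(image: np.ndarray) -> np.ndarray:
    # Hypothetical stand-in for stage 1 (image-to-3D generation):
    # deterministically map the input RGB image to a latent shape code.
    seed = zlib.crc32(image.tobytes())
    return np.random.default_rng(seed).standard_normal(8)

def generate_sequence(z_start: np.ndarray, z_end: np.ndarray, n_frames: int = 5):
    # Sketch of stage 2: evolve intermediate states between the two end
    # latents. A real system would replace this linear interpolation with
    # a physics-driven deformation module enforcing motion plausibility.
    ts = np.linspace(0.0, 1.0, n_frames)
    return [(1.0 - t) * z_start + t * z_end for t in ts]

# Two in-the-wild RGB images: initial and final states of the action.
img_start = np.zeros((4, 4, 3), dtype=np.uint8)
img_end = np.full((4, 4, 3), 255, dtype=np.uint8)
frames = generate_sequence(reconstruct_3d(img_start), reconstruct_3d(img_end))
```

The key design point mirrored here is that no 4D supervision enters the pipeline: each stage only consumes the two input images and the latents derived from them.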