Bringing NeRFs to the Latent Space: Inverse Graphics Autoencoder

📅 2024-10-30
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work addresses the lack of 3D geometric structure in 2D image latent spaces, which prevents inverse graphics from being applied to them directly. The authors propose the Inverse Graphics Autoencoder (IG-AE), an image autoencoder whose latent space is regularized with 3D geometry by aligning it with jointly trained latent 3D scenes. The trained IG-AE is then used in a latent NeRF training pipeline, released as an open-source extension of the Nerfstudio framework. Experiments show that Latent NeRFs trained with IG-AE reach better quality than those trained with a standard autoencoder, while training and rendering faster than NeRFs trained in image space. Working in the latent space also keeps the approach interoperable with other latent-based 2D methods.

📝 Abstract
While pre-trained image autoencoders are increasingly utilized in computer vision, the application of inverse graphics in 2D latent spaces has been under-explored. Yet, besides reducing the training and rendering complexity, applying inverse graphics in the latent space enables a valuable interoperability with other latent-based 2D methods. The major challenge is that inverse graphics cannot be directly applied to such image latent spaces because they lack an underlying 3D geometry. In this paper, we propose an Inverse Graphics Autoencoder (IG-AE) that specifically addresses this issue. To this end, we regularize an image autoencoder with 3D-geometry by aligning its latent space with jointly trained latent 3D scenes. We utilize the trained IG-AE to bring NeRFs to the latent space with a latent NeRF training pipeline, which we implement in an open-source extension of the Nerfstudio framework, thereby unlocking latent scene learning for its supported methods. We experimentally confirm that Latent NeRFs trained with IG-AE present an improved quality compared to a standard autoencoder, all while exhibiting training and rendering accelerations with respect to NeRFs trained in the image space. Our project page can be found at https://ig-ae.github.io.
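
The abstract describes aligning an autoencoder's latent space with jointly trained latent 3D scenes but does not state the loss. The sketch below is only an illustration of what such a joint objective could look like, assuming a mean-squared alignment term between encoded and rendered latents plus an image reconstruction term. The names `alignment_objective` and `recon_weight`, the tensor shapes, and the choice of MSE are all assumptions, not the paper's formulation.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((a - b) ** 2))

def alignment_objective(encoded_latent, rendered_latent,
                        decoded_image, image, recon_weight=1.0):
    """Hypothetical joint objective (an assumption, not the paper's loss).

    encoded_latent:  latent produced by the image encoder for one view
    rendered_latent: the same view rendered from a latent 3D scene
    decoded_image:   decoder output for the rendered latent
    image:           ground-truth RGB view
    """
    align = mse(encoded_latent, rendered_latent)  # 2D latent vs. 3D-consistent latent
    recon = mse(decoded_image, image)             # keep decoded views faithful
    return align + recon_weight * recon

# Toy stand-ins for real network outputs (shapes are arbitrary).
rng = np.random.default_rng(0)
image = rng.random((3, 64, 64))
encoded_latent = rng.random((4, 8, 8))
rendered_latent = encoded_latent.copy()  # perfectly aligned latent render
decoded_image = image.copy()             # perfect reconstruction

loss_aligned = alignment_objective(
    encoded_latent, rendered_latent, decoded_image, image)
loss_misaligned = alignment_objective(
    encoded_latent, rendered_latent + 0.5, decoded_image, image)
```

In this toy setup the objective is zero when the rendered latent matches the encoded one and grows as the two drift apart, which is the behaviour an alignment regularizer is meant to penalize.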
Problem

Research questions and friction points this paper is trying to address.

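Inverse graphics in 2D latent spaces remains under-explored
Image latent spaces lack an underlying 3D geometry, so inverse graphics cannot be applied to them directly
NeRFs trained in image space have higher training and rendering complexity than latent-space alternatives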
Innovation

Methods, ideas, or system contributions that make the work stand out.

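Inverse Graphics Autoencoder (IG-AE) with a 3D-aware latent space
3D-geometry regularization that aligns the autoencoder's latent space with jointly trained latent 3D scenes
Latent NeRF training pipeline, implemented as an open-source extension of Nerfstudio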
Authors
Antoine Schnepf
Université Côte d'Azur, CNRS, I3S, France; Criteo AI Lab, Paris, France
Karim Kassab
Criteo AI Lab, Paris, France; LASTIG, Université Gustave Eiffel, IGN-ENSG, F-94160 Saint-Mandé
Jean-Yves Franceschi
Criteo AI Lab
AI, Machine Learning, Deep Learning
Laurent Caraffa
Researcher
Computer vision, Markov Random Field, image defogging, Computational geometry
Flavian Vasile
Criteo
Jeremie Mary
Criteo AI Lab, Paris, France
Andrew Comport
Université Côte d'Azur, CNRS, I3S, France
Valérie Gouet-Brunet
LASTIG, Université Gustave Eiffel, IGN-ENSG, F-94160 Saint-Mandé