Unsupervised Discovery of Object-Centric Neural Fields

📅 2024-02-12
🏛️ arXiv.org
📈 Citations: 3
Influential: 1
🤖 AI Summary
Existing unsupervised 3D object discovery methods work on simple synthetic images but generalize poorly to real-world scenes, because their object representations entangle intrinsic attributes (e.g., shape and appearance) with extrinsic, viewer-centric properties such as 3D location. Method: The paper proposes uOCF (Unsupervised discovery of Object-Centric neural Fields), which infers a 3D object-centric scene representation from a single image by learning object intrinsics and modeling object extrinsics separately. Contribution/Results: The authors collect three new datasets, including two real kitchen environments. Experiments show that uOCF discovers visually rich objects from a single real image without supervision, supports 3D object segmentation and scene manipulation, and generalizes zero-shot to unseen objects.

📝 Abstract
We study inferring 3D object-centric scene representations from a single image. While recent methods have shown potential in unsupervised 3D object discovery from simple synthetic images, they fail to generalize to real-world scenes with visually rich and diverse objects. This limitation stems from their object representations, which entangle objects' intrinsic attributes like shape and appearance with extrinsic, viewer-centric properties such as their 3D location. To address this bottleneck, we propose Unsupervised discovery of Object-Centric neural Fields (uOCF). uOCF focuses on learning the intrinsics of objects and models the extrinsics separately. Our approach significantly improves systematic generalization, thus enabling unsupervised learning of high-fidelity object-centric scene representations from sparse real-world images. To evaluate our approach, we collect three new datasets, including two real kitchen environments. Extensive experiments show that uOCF enables unsupervised discovery of visually rich objects from a single real image, allowing applications such as 3D object segmentation and scene manipulation. Notably, uOCF demonstrates zero-shot generalization to unseen objects from a single real image. Project page: https://red-fairy.github.io/uOCF/
Problem

Research questions and friction points this paper is trying to address.

Unsupervised 3D object discovery
Generalization to real-world scenes
Separation of intrinsic and extrinsic attributes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unsupervised object-centric neural fields
Separates intrinsics and extrinsics
Zero-shot generalization to unseen objects
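
The separation of intrinsics and extrinsics can be illustrated with a toy example. This is a minimal sketch of the general idea only, not the uOCF implementation: every name here (`ObjectSlot`, `unit_sphere_density`, `scene_density`) is hypothetical, a hand-written density function stands in for a learned neural field, and the extrinsics are reduced to a translation with no rotation.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Point = Tuple[float, float, float]


@dataclass
class ObjectSlot:
    # Intrinsics: a field defined in the object's own coordinate frame.
    # (Hypothetical stand-in for a learned per-object neural field.)
    density_fn: Callable[[Point], float]
    # Extrinsics: where the object sits in the scene (world frame).
    position: Point


def unit_sphere_density(p: Point) -> float:
    """Toy intrinsic shape: density 1 inside a unit sphere at the origin."""
    x, y, z = p
    return 1.0 if x * x + y * y + z * z <= 1.0 else 0.0


def to_object_frame(p_world: Point, position: Point) -> Point:
    """Express a world-frame point relative to the object's position."""
    return tuple(pw - c for pw, c in zip(p_world, position))


def scene_density(p_world: Point, slots: List[ObjectSlot]) -> float:
    """Sum per-object densities, each queried in its own object frame."""
    return sum(
        slot.density_fn(to_object_frame(p_world, slot.position))
        for slot in slots
    )


# The same intrinsic field is reused at two different scene locations.
slots = [
    ObjectSlot(unit_sphere_density, (0.0, 0.0, 0.0)),
    ObjectSlot(unit_sphere_density, (5.0, 0.0, 0.0)),
]
inside_second = scene_density((5.0, 0.5, 0.0), slots)
between_objects = scene_density((2.5, 0.0, 0.0), slots)
```

Because the shape function never sees world coordinates, moving an object changes only its `position`, which is the kind of decoupling the abstract credits for better generalization and for scene manipulation.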
Rundong Luo (Stanford University)
Hong-Xing Yu (Stanford University)
Jiajun Wu (Stanford University)