Believing is Seeing: Unobserved Object Detection using Generative Models

📅 2024-10-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper introduces the task of *unobserved object detection*: predicting, from a single image, the location of nearby objects that are occluded or lie outside the field of view. It adapts pre-trained generative models, including 2D and 3D diffusion models and vision-language models (VLMs), to infer the presence of such objects. Key contributions include: (1) a definition of unobserved object detection in 2D, 2.5D, and 3D settings; (2) a suite of evaluation metrics that capture different aspects of performance; and (3) an empirical evaluation on indoor scenes from RealEstate10k and NYU Depth v2 whose results motivate the use of generative models for this task.

📝 Abstract
Can objects that are not visible in an image -- but are in the vicinity of the camera -- be detected? This study introduces the novel tasks of 2D, 2.5D and 3D unobserved object detection for predicting the location of nearby objects that are occluded or lie outside the image frame. We adapt several state-of-the-art pre-trained generative models to address this task, including 2D and 3D diffusion models and vision-language models, and show that they can be used to infer the presence of objects that are not directly observed. To benchmark this task, we propose a suite of metrics that capture different aspects of performance. Our empirical evaluation on indoor scenes from the RealEstate10k and NYU Depth v2 datasets demonstrates results that motivate the use of generative models for the unobserved object detection task.
Problem

Research questions and friction points this paper is trying to address.

Detecting nearby objects not visible in images
Using generative models for unobserved object detection
Benchmarking with new metrics for performance evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adapts pre-trained generative models
Uses 2D and 3D diffusion models and vision-language models
Proposes metrics for unobserved object detection
Subhransu S. Bhattacharjee
The Australian National University
Dylan Campbell
Lecturer, Australian National University
Registration, Global optimization, 3D Reconstruction, 3D/Stereo Scene Analysis
Rahul Shome
The Australian National University