🤖 AI Summary
Existing egocentric 3D datasets typically lack some combination of ground-truth 3D geometry, day-to-night illumination variation, and complete 6-degree-of-freedom (6DoF) pose annotations, which hinders robust evaluation of egocentric 3D perception. To address this, we introduce Oxford Day-and-Night, a large-scale egocentric 3D dataset with complete ground truth, captured in the same scenes by day and by night: 30 km of trajectories across 40,000 m² of real-world environments. It provides high-accuracy 6DoF camera poses, densely reconstructed point clouds, cross-temporal geometric alignment between sessions, and illumination labels for each recording epoch. Using Meta ARIA glasses, we acquire synchronized video streams and employ multi-session SLAM for precise pose estimation and dense reconstruction. The dataset enables two benchmarks, novel-view synthesis and visual relocalisation under extreme lighting change, filling a critical gap in egocentric 3D perception evaluation and making it possible to validate model generalization and robustness in low-light and dynamically lit real-world scenarios.
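Cross-session alignment is central to the dataset: day and night sequences are registered into a shared reference frame. Below is a minimal, illustrative sketch of one standard way to perform this kind of registration, assuming corresponding camera centres between two sessions and using the closed-form Umeyama similarity alignment. This is not the paper's method (alignment there comes from multi-session SLAM), and all names and data below are hypothetical.

```python
import numpy as np

def umeyama_alignment(src, dst):
    """Least-squares similarity transform (s, R, t) so that dst ~ s * R @ src + t.

    src, dst: (N, 3) arrays of corresponding camera centres from two sessions.
    """
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst
    cov = dst_c.T @ src_c / len(src)                # 3x3 cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:    # guard against reflections
        S[2, 2] = -1.0
    R = U @ S @ Vt
    var_src = src_c.var(axis=0).sum()               # mean squared norm of src_c
    s = np.trace(np.diag(D) @ S) / var_src          # optimal scale (Umeyama 1991)
    t = mu_dst - s * R @ mu_src
    return s, R, t

# Hypothetical usage: synthetic day/night camera centres with known correspondences.
day = np.random.rand(100, 3) * 50.0
night = 1.2 * day + np.array([3.0, -1.0, 0.5])      # scaled, shifted "night" frame
s, R, t = umeyama_alignment(night, day)             # map night frame into day frame
night_aligned = s * night @ R.T + t
print(np.abs(night_aligned - day).max())            # ~0 for this noise-free example
```

In practice, correspondences would come from feature matching or shared map points rather than being given, and multi-session SLAM jointly refines all poses instead of applying a single closed-form transform.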
📝 Abstract
We introduce Oxford Day-and-Night, a large-scale, egocentric dataset for novel view synthesis (NVS) and visual relocalisation under challenging lighting conditions. Existing datasets often lack crucial combinations of features such as ground-truth 3D geometry, wide-ranging lighting variation, and full 6DoF motion. Oxford Day-and-Night addresses these gaps by leveraging Meta ARIA glasses to capture egocentric video and applying multi-session SLAM to estimate camera poses, reconstruct 3D point clouds, and align sequences captured under varying lighting conditions, including both day and night. The dataset spans over 30 km of recorded trajectories and covers an area of 40,000 m², offering a rich foundation for egocentric 3D vision research. It supports two core benchmarks, NVS and relocalisation, providing a unique platform for evaluating models in realistic and diverse environments.
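For the relocalisation benchmark, evaluation typically reports the fraction of query images whose estimated 6DoF poses fall within joint translation and rotation error bounds. A minimal sketch of that metric follows, assuming the threshold values commonly used in long-term visual localisation benchmarks; the paper's exact protocol may differ.

```python
import numpy as np

def pose_errors(R_est, t_est, R_gt, t_gt):
    """Translation error in metres and rotation error in degrees."""
    t_err = float(np.linalg.norm(t_est - t_gt))
    cos_angle = (np.trace(R_est.T @ R_gt) - 1.0) / 2.0          # from the relative rotation
    r_err = float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))
    return t_err, r_err

def recall_at_thresholds(errors, thresholds=((0.25, 2.0), (0.5, 5.0), (5.0, 10.0))):
    """Fraction of queries with t_err <= metres AND r_err <= degrees per threshold.

    errors: (N, 2) array, one (t_err, r_err) row per query image.
    Default thresholds are the common long-term localisation choices (an assumption).
    """
    errors = np.asarray(errors)
    return {f"{m}m/{d}deg": float(np.mean((errors[:, 0] <= m) & (errors[:, 1] <= d)))
            for m, d in thresholds}

# Hypothetical usage with a single perfectly localised query.
e = pose_errors(np.eye(3), np.zeros(3), np.eye(3), np.zeros(3))
print(recall_at_thresholds([e]))                                 # all recalls = 1.0
```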