🤖 AI Summary
Existing 3D indoor datasets prioritize scale over geometric fidelity, limiting reliable evaluation of dense geometric tasks such as novel view synthesis, scene reconstruction, and SLAM. To address this, we propose the first high-accuracy, densely annotated 3D framework tailored for indoor scenes, enabling joint generation of full-scene dense meshes and physically accurate rendered depth maps as ground truth. Our end-to-end annotation pipeline integrates multi-view scanning, joint ICP/PnP pose calibration, differentiable rendering, and semantic-guided mesh completion. We release 11 high-fidelity indoor scenes, each with precise camera poses, complete object-level meshes, and pixel-accurate depth ground truth. This dataset establishes a new benchmark for reconstruction, SLAM, and 6D object pose estimation, significantly improving the accuracy and reliability of dense geometric evaluation.
📝 Abstract
Traditionally, 3D indoor datasets have prioritized scale over ground-truth accuracy in order to obtain improved generalization. However, using these datasets to evaluate dense geometry tasks, such as depth rendering, can be problematic because their meshes are often incomplete and may therefore provide incorrect ground truth for evaluating fine details. In this paper, we propose SCRREAM, a dataset annotation framework that annotates fully dense meshes of the objects in a scene and registers camera poses on the real image sequence, producing accurate ground truth for both sparse and dense 3D tasks. We describe the dataset annotation pipeline in detail and showcase four dataset variants that can be obtained from our framework with example scenes: indoor reconstruction and SLAM, scene editing & object removal, human reconstruction, and 6D pose estimation. Recent pipelines for indoor reconstruction and SLAM serve as new benchmarks. In contrast to previous indoor datasets, our design allows dense geometry tasks to be evaluated on eleven sample scenes against accurately rendered ground-truth depth maps.