🤖 AI Summary
Static object-level reconstruction suffers from inherent ambiguity in dynamic scenes. To address this, we propose a proactive online decomposition and reconstruction framework that leverages human-object interaction cues. Given egocentric video input, our method jointly optimizes camera pose estimation, instance-level segmentation, and dense map updating, using dynamic priors derived from human motion to decouple scene components. It combines Gaussian splatting rendering, multi-task learning, and a lightweight SLAM module to achieve real-time, temporally consistent modeling of both moving objects and the static background. Evaluation on multiple real-world scenes shows improvements over existing static or passive reconstruction approaches in reconstruction accuracy, novel-view synthesis quality, and dynamic consistency. The framework supports photorealistic, efficient rendering and scalable dynamic scene understanding.
📝 Abstract
Human behavior is the major cause of scene dynamics and therefore carries rich cues about those dynamics. This paper formalizes a new task, proactive scene decomposition and reconstruction: an online approach that leverages human-object interactions to iteratively disassemble and reconstruct the environment. Observing these intentional interactions lets the system dynamically refine decomposition and reconstruction, addressing the ambiguities inherent in static object-level reconstruction. The proposed system integrates multiple tasks in dynamic environments, including accurate camera and object pose estimation, instance decomposition, and online map updating. It capitalizes on cues from human-object interactions in egocentric live streams, offering a flexible, progressive alternative to conventional object-level reconstruction methods. Aided by Gaussian splatting, it achieves accurate and consistent dynamic scene modeling with photorealistic and efficient rendering. Its efficacy is validated in multiple real-world scenarios, where it shows promising advantages.