🤖 AI Summary
Existing 4D dynamic-scene reconstruction lacks support from multi-view egocentric data. To address this, the authors introduce the first multi-view egocentric video dataset tailored for dynamic social scenarios, covering five real-world scenes (meetings, performances, and a presentation), each captured simultaneously by five participants wearing synchronized AR glasses, with sub-millisecond temporal alignment and accurate pose annotations. They contribute a custom hardware synchronization system for arrays of AR glasses, a unified pipeline for multi-camera calibration and pose estimation, and an evaluation framework for 4D reconstruction and free-viewpoint video (FVV) generation. Experiments validate the dataset's practical utility and effectiveness for FVV synthesis, bridging the gap in both data and methodology for multi-view egocentric reconstruction of dynamic social interactions. The work establishes a reproducible benchmark and releases all resources, including data, code, and models, as open source.
📝 Abstract
Multi-view egocentric dynamic scene reconstruction holds significant research value for holographic documentation of social interactions. However, existing reconstruction datasets focus on static multi-view or single egocentric-view setups, and no multi-view egocentric dataset exists for dynamic scene reconstruction. We therefore present MultiEgo, the first multi-view egocentric dataset for 4D dynamic scene reconstruction. The dataset comprises five canonical social interaction scenes: meetings, performances, and a presentation. Each scene provides five authentic egocentric videos captured by participants wearing AR glasses. We design a hardware-based data acquisition system and processing pipeline that achieves sub-millisecond temporal synchronization across views, coupled with accurate pose annotations. Experimental validation demonstrates the practical utility and effectiveness of our dataset for free-viewpoint video (FVV) applications, establishing MultiEgo as a foundational resource for advancing multi-view egocentric dynamic scene reconstruction research.