ExoGS: A 4D Real-to-Sim-to-Real Framework for Scalable Manipulation Data Collection

📅 2026-01-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limitations of existing real-to-sim-to-real approaches, which predominantly focus on visual transfer and struggle to model dynamic physical interactions in the real world, a gap that particularly hinders data generation for contact-rich manipulation tasks. To overcome this, the authors propose the ExoGS framework, which uses a custom passive exoskeleton, AirExo-3, to synchronously capture high-fidelity human motion and RGB images. These recordings are reconstructed into editable, dynamic 4D scenes based on 3D Gaussian Splatting, enabling geometrically consistent replay and large-scale data augmentation in simulation. By integrating a lightweight semantic adapter, the method achieves, for the first time, seamless transfer of real-world dynamic interactions into simulation. The approach significantly outperforms conventional teleoperation baselines, improving both data efficiency and the cross-domain generalization of policies in real-world settings. Code and hardware designs are publicly released.

📝 Abstract
The Real-to-Sim-to-Real technique is gaining increasing interest in robotic manipulation, as it can generate scalable data in simulation while maintaining a narrower sim-to-real gap. However, previous methods have mainly focused on environment-level visual real-to-sim transfer, ignoring the transfer of interactions, which can be challenging and inefficient to obtain purely in simulation, especially for contact-rich tasks. We propose ExoGS, a robot-free 4D Real-to-Sim-to-Real framework that captures both static environments and dynamic interactions in the real world and transfers them seamlessly to a simulated environment. It provides a new solution for scalable manipulation data collection and policy learning. ExoGS employs a self-designed, robot-isomorphic passive exoskeleton, AirExo-3, to capture kinematically consistent trajectories with millimeter-level accuracy, along with synchronized RGB observations, during direct human demonstrations. The robot, objects, and environment are reconstructed as editable 3D Gaussian Splatting assets, enabling geometry-consistent replay and large-scale data augmentation. Additionally, a lightweight Mask Adapter injects instance-level semantics into the policy to enhance robustness under visual domain shifts. Real-world experiments demonstrate that ExoGS significantly improves data efficiency and policy generalization compared to teleoperation-based baselines. Code and hardware files have been released at https://github.com/zaixiabalala/ExoGS.
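The abstract states that a lightweight Mask Adapter injects instance-level semantics into the policy, but the page gives no architectural details. As a loose illustrative sketch only (the function name and the channel-concatenation scheme are assumptions, not the authors' design), one minimal way to feed instance masks to a visual policy is to append them to the RGB observation as extra channels:

```python
import numpy as np

def inject_instance_masks(rgb, masks):
    """Append per-instance binary masks to an RGB observation as extra
    channels. This is a hypothetical stand-in for a mask-conditioned
    policy input, not the paper's actual Mask Adapter.

    rgb:   (H, W, 3) float array, pixel values in [0, 1]
    masks: (K, H, W) binary array, one channel per object instance
    returns: (H, W, 3 + K) array for the policy's visual encoder
    """
    assert rgb.shape[:2] == masks.shape[1:], "spatial sizes must match"
    # Move the instance axis last so masks align with the RGB channels.
    mask_channels = masks.transpose(1, 2, 0).astype(rgb.dtype)  # (H, W, K)
    return np.concatenate([rgb, mask_channels], axis=-1)

# Example: a 4x4 image with 2 object instances yields 5 input channels.
obs = inject_instance_masks(np.zeros((4, 4, 3)), np.ones((2, 4, 4)))
```

In practice such semantic channels can make a policy less sensitive to texture and lighting shifts, since the instance layout survives visual domain changes; how ExoGS actually fuses this signal is described in the paper itself.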
Problem

Research questions and friction points this paper is trying to address.

Real-to-Sim-to-Real
robotic manipulation
dynamic interaction transfer
contact-rich tasks
scalable data collection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Real-to-Sim-to-Real
4D manipulation data
3D Gaussian Splatting
passive exoskeleton
Mask Adapter
Yiming Wang (Shanghai Jiao Tong University)
Ruogu Zhang (Shanghai Jiao Tong University)
Minyang Li (Shanghai Jiao Tong University)
Hao Shi (Shanghai Jiao Tong University)
Junbo Wang (Shanghai Jiao Tong University)
Deyi Li (Shanghai Jiao Tong University)
Jieji Ren (Shanghai Jiao Tong University)
Wenhai Liu (Shanghai Jiao Tong University)
Weiming Wang (Shanghai Jiao Tong University)
Hao-Shu Fang (Massachusetts Institute of Technology)