Novel Demonstration Generation with Gaussian Splatting Enables Robust One-Shot Manipulation

📅 2025-04-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the scarcity, high cost, and limited diversity of real-world teleoperation demonstration data. To this end, the authors propose a single-demonstration, multi-dimensional data generation framework based on 3D Gaussian Splatting (3DGS). Methodologically, it is the first approach to edit scenes directly within the 3D Gaussian representation space, covering object replacement, equivariant spatial transformations, and editing of illumination, appearance, viewpoint, and novel content. This enables consistent augmentation across six generalization dimensions: object category, pose, appearance, lighting, viewpoint, and robot embodiment. The framework thereby overcomes the modeling bottlenecks of conventional 2D data augmentation and of physics-based simulation. In real-robot experiments it achieves an average success rate of 87.8% from a single demonstration, substantially outperforming baseline methods (57.2%), and shows strong robustness and cross-dimensional generalization under multiple perturbations.

📝 Abstract
Visuomotor policies learned from teleoperated demonstrations face challenges such as lengthy data collection, high costs, and limited data diversity. Existing approaches address these issues by augmenting image observations in RGB space or employing Real-to-Sim-to-Real pipelines based on physical simulators. However, the former is constrained to 2D data augmentation, while the latter suffers from imprecise physical simulation caused by inaccurate geometric reconstruction. This paper introduces RoboSplat, a novel method that generates diverse, visually realistic demonstrations by directly manipulating 3D Gaussians. Specifically, we reconstruct the scene through 3D Gaussian Splatting (3DGS), directly edit the reconstructed scene, and augment data across six types of generalization with five techniques: 3D Gaussian replacement for varying object types, scene appearance, and robot embodiments; equivariant transformations for different object poses; visual attribute editing for various lighting conditions; novel view synthesis for new camera perspectives; and 3D content generation for diverse object types. Comprehensive real-world experiments demonstrate that RoboSplat significantly enhances the generalization of visuomotor policies under diverse disturbances. Notably, while policies trained on hundreds of real-world demonstrations with additional 2D data augmentation achieve an average success rate of 57.2%, RoboSplat attains 87.8% in one-shot settings across six types of generalization in the real world.
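Of the five techniques the abstract lists, the equivariant transformation is the most self-contained: moving an object to a new pose amounts to applying the same rigid SE(3) transform to every Gaussian attached to it. Below is a minimal sketch of that idea; the function name, array shapes, and NumPy implementation are illustrative assumptions, not code from the paper.

```python
import numpy as np

def transform_gaussians(means, covariances, R, t):
    """Apply a rigid SE(3) transform (R, t) to a set of 3D Gaussians.

    means:       (N, 3) Gaussian centers.
    covariances: (N, 3, 3) Gaussian covariance matrices.
    R:           (3, 3) rotation matrix; t: (3,) translation.

    Each center is rotated and translated; each covariance is
    conjugated by R, which is what keeps the splats' anisotropic
    shape consistent with the new object pose.
    """
    new_means = means @ R.T + t          # rotate then translate centers
    new_covs = R @ covariances @ R.T     # broadcasts over the N Gaussians
    return new_means, new_covs
```

Because color and opacity attributes are pose-invariant, they can be carried over unchanged, which is why this augmentation stays visually consistent across object poses.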
Problem

Research questions and friction points this paper is trying to address.

Overcoming limited diversity in visuomotor policy training data
Addressing imprecise geometric reconstruction in Real-to-Sim pipelines
Enhancing generalization of policies under diverse real-world disturbances
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses 3D Gaussian Splatting for scene reconstruction
Directly edits 3D Gaussians for data augmentation
Covers six types of generalization with five augmentation techniques
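Among the listed contributions, the lighting augmentation ("visual attribute editing") can be pictured as a global rescaling of each Gaussian's color attributes. The sketch below is a simplified stand-in for that idea; the function name and the brightness/tint knobs are hypothetical, not parameters from the paper (3DGS stores view-dependent spherical-harmonic colors, of which this edits only the base RGB term).

```python
import numpy as np

def relight_gaussians(colors, brightness=1.0, tint=(1.0, 1.0, 1.0)):
    """Mimic a lighting change by rescaling per-Gaussian base colors.

    colors:     (N, 3) RGB values in [0, 1].
    brightness: hypothetical global intensity multiplier.
    tint:       hypothetical per-channel multiplier (e.g. warm light).
    """
    out = np.asarray(colors) * brightness * np.asarray(tint)
    return np.clip(out, 0.0, 1.0)  # keep colors in the valid range
```

Because only appearance attributes change while geometry stays fixed, renders under the edited lighting remain geometrically consistent with the original demonstration.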