🤖 AI Summary
Robot policies trained in simulation often generalize poorly to the real world because of the sim-to-real domain gap. To address this, the paper proposes Semantic 2D Gaussian Splatting (S2GS), a representation for extracting domain-invariant semantic features. S2GS builds object-centric, multi-view 2D semantic fields and projects them into a unified 3D space via feature-level Gaussian splatting; a semantic filtering mechanism then removes irrelevant background content to yield robust, consistent spatial features. The representation is decoupled from the downstream policy (Diffusion Policy in the experiments), so it can be integrated directly into cross-domain policy learning. Policies trained exclusively in the ManiSkill simulation environment with S2GS transfer zero-shot to real robotic platforms without fine-tuning. On manipulation tasks, including grasping and pushing, they significantly outperform state-of-the-art domain randomization and adaptation methods.
📝 Abstract
Cross-domain transfer in robotic manipulation remains a longstanding challenge due to the significant domain gap between simulated and real-world environments. Existing methods such as domain randomization, domain adaptation, and sim-real calibration often require extensive tuning or fail to generalize to unseen scenarios. We observe that if a policy is trained in simulation on domain-invariant features, and the same features can be extracted and provided as input to the policy during real-world deployment, the domain gap can be effectively bridged and policy generalization significantly improved. Accordingly, we propose Semantic 2D Gaussian Splatting (S2GS), a novel representation method that extracts object-centric, domain-invariant spatial features. S2GS constructs multi-view 2D semantic fields and projects them into a unified 3D space via feature-level Gaussian splatting. A semantic filtering mechanism removes irrelevant background content, ensuring clean and consistent inputs for policy learning. To evaluate the effectiveness of S2GS, we adopt Diffusion Policy as the downstream learning algorithm and conduct experiments in the ManiSkill simulation environment, followed by real-world deployment. Results demonstrate that S2GS significantly improves sim-to-real transferability, maintaining high and stable task performance in real-world scenarios.
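The abstract does not specify how the semantic filtering step is implemented. As an illustration only, the sketch below assumes each Gaussian in the unified 3D space carries a semantic feature vector, and that filtering keeps the Gaussians whose feature has a high cosine similarity to a task-relevant query feature. The `SemanticGaussian` class, the `semantic_filter` function, the toy 3-dimensional features, and the 0.8 threshold are all hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class SemanticGaussian:
    """Hypothetical container: a 3D Gaussian center plus a semantic feature."""
    center: Tuple[float, float, float]  # position in the shared 3D frame
    feature: Sequence[float]            # semantic embedding for this Gaussian


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_filter(
    gaussians: List[SemanticGaussian],
    query_feature: Sequence[float],
    threshold: float = 0.8,  # assumed value, chosen only for this toy example
) -> List[SemanticGaussian]:
    """Keep Gaussians whose feature is similar enough to the query feature."""
    return [
        g for g in gaussians
        if cosine_similarity(g.feature, query_feature) >= threshold
    ]


# Toy scene: two object-like Gaussians and one background-like Gaussian.
scene = [
    SemanticGaussian((0.10, 0.00, 0.05), (0.9, 0.1, 0.0)),
    SemanticGaussian((0.12, 0.01, 0.06), (0.8, 0.2, 0.1)),
    SemanticGaussian((0.50, 0.40, 0.00), (0.0, 0.1, 0.9)),
]
object_query = (1.0, 0.0, 0.0)  # stand-in for a task-relevant semantic embedding

kept = semantic_filter(scene, object_query, threshold=0.8)
```

In this sketch the same filter would run on features extracted in simulation and in the real world, so the policy only ever receives the object-centric subset in both domains. That is the property the abstract relies on to bridge the domain gap.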