🤖 AI Summary
This work addresses the visual domain gap between simulation and reality by proposing a unified, domain-agnostic point cloud representation framework that does not require explicit visual or object alignment. The approach integrates semantic features extracted from a vision-language model with a Transformer-based policy network, enabling robot policies to be trained exclusively on synthetic data while remaining effective in real-world settings. It further supports joint training with a small number of real-world demonstrations. In both single-task and multi-task scenarios, the method achieves up to a 44% improvement in zero-shot transfer success rate over existing approaches, and when augmented with limited real-world data, the improvement rises to 66%, substantially outperforming current state-of-the-art methods.
📝 Abstract
Robot foundation models are beginning to deliver on the promise of generalist robotic agents, yet progress remains constrained by the scarcity of large-scale real-world manipulation datasets. Simulation and synthetic data generation offer a scalable alternative, but their usefulness is limited by the visual domain gap between simulation and reality. In this work, we present Point Bridge, a framework that leverages unified, domain-agnostic point-based representations to unlock synthetic datasets for zero-shot sim-to-real policy transfer, without explicit visual or object-level alignment. Point Bridge combines automated point-based representation extraction via Vision-Language Models (VLMs), transformer-based policy learning, and efficient inference-time pipelines to train capable real-world manipulation agents using only synthetic data. With additional co-training on small sets of real demonstrations, Point Bridge further improves performance, substantially outperforming prior vision-based sim-and-real co-training methods. It achieves up to 44% gains in zero-shot sim-to-real transfer and up to 66% with limited real data across both single-task and multi-task settings. Videos of the robot are best viewed at: https://pointbridge3d.github.io/