Contact-Anchored Policies: Contact Conditioning Creates Strong Robot Utility Models

📅 2026-02-09
🤖 AI Summary
Language instructions are often too abstract to support robust robotic manipulation in complex physical interactions. To address this, the work replaces language commands with spatial contact points as the conditioning signal for policy learning, structures the policies as a library of modular utility models, and pairs them with EgoGym, a lightweight simulation benchmark, for rapid real-to-sim iteration. Using only 23 hours of demonstration data, the approach generalizes out of the box to novel environments and robot embodiments on three fundamental manipulation skills, and outperforms large state-of-the-art vision-language-action models (VLAs) by 56% in zero-shot evaluations. The core innovation is the contact-point-conditioned policy, which the authors report substantially improves generalization.

📝 Abstract
The prevalent paradigm in robot learning attempts to generalize across environments, embodiments, and tasks with language prompts at runtime. A fundamental tension limits this approach: language is often too abstract to guide the concrete physical understanding required for robust manipulation. In this work, we introduce Contact-Anchored Policies (CAP), which replace language conditioning with points of physical contact in space. Simultaneously, we structure CAP as a library of modular utility models rather than a monolithic generalist policy. This factorization allows us to implement a real-to-sim iteration cycle: we build EgoGym, a lightweight simulation benchmark, to rapidly identify failure modes and refine our models and datasets prior to real-world deployment. We show that by conditioning on contact and iterating via simulation, CAP generalizes to novel environments and embodiments out of the box on three fundamental manipulation skills while using only 23 hours of demonstration data, and outperforms large, state-of-the-art VLAs in zero-shot evaluations by 56%. All model checkpoints, codebase, hardware, simulation, and datasets will be open-sourced. Project page: https://cap-policy.github.io/
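
The two structural ideas in the abstract, conditioning on a contact point instead of a language prompt and keeping a library of per-skill utility models, can be illustrated with a minimal sketch. This is not the authors' implementation: the skill names, the `Observation` type, and the hand-written proportional controller standing in for a learned policy are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Point3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Observation:
    """Hypothetical observation: gripper position in the camera frame (metres)."""
    gripper_xyz: Point3


# A utility model maps (observation, contact point) to a translation action.
UtilityModel = Callable[[Observation, Point3], Point3]


def make_reach_model(max_step: float) -> UtilityModel:
    """Stand-in for a learned policy (assumption): move toward the contact
    point, with the step length clipped to max_step."""
    def policy(obs: Observation, contact: Point3) -> Point3:
        delta = [c - g for c, g in zip(contact, obs.gripper_xyz)]
        norm = sum(d * d for d in delta) ** 0.5
        scale = 1.0 if norm <= max_step else max_step / norm
        return (delta[0] * scale, delta[1] * scale, delta[2] * scale)
    return policy


# Hypothetical library: one small model per skill instead of one generalist.
LIBRARY: Dict[str, UtilityModel] = {
    "skill_a": make_reach_model(max_step=0.05),
    "skill_b": make_reach_model(max_step=0.02),
}


def act(skill: str, obs: Observation, contact: Point3) -> Point3:
    """Select a utility model by name and condition it on the contact point."""
    return LIBRARY[skill](obs, contact)
```

Here the caller chooses the skill explicitly and supplies where contact should happen as a 3D point, so no text prompt is involved at any stage.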
Problem

Research questions and friction points this paper is trying to address.

robot learning
language conditioning
physical manipulation
generalization
contact understanding
Innovation

Methods, ideas, or system contributions that make the work stand out.

Contact-Anchored Policies
modular utility models
real-to-sim iteration
physical contact conditioning
EgoGym
Zichen Jeff Cui
New York University
Omar Rayyan
UCLA
Robotics, Machine Learning
Haritheja Etukuru
UC Berkeley
Robotics, Machine Learning
Bowen Tan
Carnegie Mellon University
Zavier Andrianarivo
New York University
Zicheng Teng
New York University
Yihang Zhou
Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences
MRI, Radiotherapy, Radiomics, Deep Learning
Krish Mehta
University of Waterloo
Nicholas Wojno
New York University
Kevin Yuanbo Wu
New York University
Manan H Anjaria
New York University
Ziyuan Wu
New York University
Manrong Mao
New York University
Guangxun Zhang
New York University
Binit Shah
Hello Robot Inc.
Yejin Kim
Ai2
Soumith Chintala
Meta AI
Artificial Intelligence, Deep Learning, Machine Learning, Computer Vision
Lerrel Pinto
New York University
Robotics, Machine Learning
Nur Muhammad Mahi Shafiullah
Postdoctoral Researcher, Meta AI & Berkeley AI Research (BAIR)
Robotics, Machine learning