🤖 AI Summary
Robotic foundation models often learn spurious correlations from their pre-training trajectories, which limits generalization beyond the training data. To address this, the authors propose Policy Contrastive Decoding (PCD), a training-free method applied at inference time: it masks the objects in the visual input and contrasts the action probability distributions obtained from the original and the object-masked observations, redirecting the policy's focus toward object-relevant visual cues. PCD needs no fine-tuning and no access to model weights, so it works as a plugin for both autoregressive policies (e.g., OpenVLA) and diffusion-based policies (e.g., Octo, π₀). Experiments in simulation and on real robots show improved generalization: PCD improves the π₀ policy by 8% in simulation and by 108% in the real-world environment. These results support PCD as a lightweight enhancement for different types of open-source robot policies.
📝 Abstract
Robotic foundation models, or generalist robot policies, hold immense potential to enable flexible, general-purpose, and dexterous robotic systems. Despite recent advances, our experiments reveal that existing robot policies are prone to learning spurious correlations from pre-training trajectories, which harms their ability to generalize beyond the training data. To tackle this, we propose Policy Contrastive Decoding (PCD), which redirects the robot policy's focus toward object-relevant visual cues by contrasting the action probability distributions derived from original and object-masked visual inputs. Because PCD is training-free, it can be used as a plugin to improve different types of robot policies without fine-tuning or access to model weights. We conduct extensive experiments on three open-source robot policies: the autoregressive policy OpenVLA and the diffusion-based policies Octo and $\pi_0$. Results in both simulation and real-world environments demonstrate PCD's flexibility and effectiveness; for example, PCD improves the state-of-the-art policy $\pi_0$ by 8% in simulation and by 108% in the real-world environment. Code and demos are publicly available at: https://Koorye.github.io/proj/PCD.
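To make the contrasting step concrete, the following is a minimal sketch for a policy that outputs logits over discrete action bins (the autoregressive case). It assumes the log-ratio form common in contrastive decoding, where the final scores are `(1 + alpha) * log p(a | original) - alpha * log p(a | masked)`. That formula, the `alpha` parameter, and the function names are illustrative assumptions, not the formulation confirmed by the abstract, and the toy logits are made up.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def contrastive_action_probs(logits_orig, logits_masked, alpha=1.0):
    """Contrast action distributions from original and object-masked inputs.

    Assumed (hypothetical) rule: boost actions whose probability drops when
    the object is masked, since those depend on object-relevant cues.
    """
    logits_orig = np.asarray(logits_orig, dtype=float)
    logits_masked = np.asarray(logits_masked, dtype=float)
    log_p = log_softmax(logits_orig)    # policy output on the original image
    log_q = log_softmax(logits_masked)  # policy output on the masked image
    scores = (1.0 + alpha) * log_p - alpha * log_q
    return np.exp(log_softmax(scores))  # renormalize to a distribution


# Toy example with three discrete action bins (made-up numbers).
logits_orig = [2.0, 1.0, 0.0]
logits_masked = [0.0, 1.0, 0.0]  # action 0 loses support once the object is hidden
p_orig = np.exp(log_softmax(np.asarray(logits_orig)))
p_pcd = contrastive_action_probs(logits_orig, logits_masked, alpha=1.0)
```

With `alpha=0.0` the sketch returns the original distribution unchanged, so `alpha` controls how strongly the masked view is used as a contrast. Diffusion-based policies such as Octo and $\pi_0$ do not expose per-bin logits in this way, so they would need a different treatment that this sketch does not cover.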