π€ AI Summary
Weak generalization of robotic policies to unseen objects remains a longstanding challenge. This paper introduces the first data augmentation method leveraging neural radiance fields (NeRF) for zero-shot policy generalization: it employs differentiable rendering to synthesize high-fidelity, geometrically consistent 3D visual data that balances photorealism and computational efficiencyβ63% faster than existing approaches. Crucially, the method requires no real-world interaction data and directly enhances out-of-distribution generalization across object domains. Evaluated on five robotic manipulation tasks involving nine novel objects, it achieves an average performance improvement of 55.6%, substantially outperforming state-of-the-art methods. The core contribution is the first integration of NeRF into the closed-loop robotic policy training pipeline, enabling efficient, geometry-aware visual data augmentation. This establishes a new paradigm for zero-shot embodied intelligence.
π Abstract
Training a policy that can generalize to unknown objects is a long standing challenge within the field of robotics. The performance of a policy often drops significantly in situations where an object in the scene was not seen during training. To solve this problem, we present NeRF-Aug, a novel method that is capable of teaching a policy to interact with objects that are not present in the dataset. This approach differs from existing approaches by leveraging the speed, photorealism, and 3D consistency of a neural radiance field for augmentation. NeRF-Aug both creates more photorealistic data and runs 63% faster than existing methods. We demonstrate the effectiveness of our method on 5 tasks with 9 novel objects that are not present in the expert demonstrations. We achieve an average performance boost of 55.6% when comparing our method to the next best method. You can see video results at https://nerf-aug.github.io.