🤖 AI Summary
Robot cloth manipulation suffers from a significant simulation-to-reality (Sim2Real) visual domain gap, compounded by the high cost of acquiring annotated real-world data, which hinders the development of closed-loop perception-planning systems. To address this, we propose GARField (Garment Attached Radiance Field), a differentiable rendering architecture that explicitly binds dynamic cloth triangle meshes to neural radiance fields (NeRFs), to our knowledge for the first time. GARField enables an end-to-end, differentiable mapping from simulated mesh states to photorealistic RGB observations, eliminating the need for real-world annotations and establishing a novel “state-to-vision” generative paradigm. Evaluated on cloth pose estimation and manipulation policy transfer, GARField achieves state-of-the-art performance with markedly improved cross-domain generalization. We publicly release our code to advance textile manipulation research toward data-efficient, physics-informed generative modeling.
📝 Abstract
While humans intuitively manipulate garments and other textile items swiftly and accurately, doing so remains a significant challenge for robots. A factor crucial to human performance is the ability to imagine, a priori, the intended result of a manipulation and hence to predict the resulting garment pose. That ability allows us to plan from highly obstructed states, adapt our plans as we collect more information, and react swiftly to unforeseen circumstances. Conversely, robots struggle to establish such intuitions and to form tight links between plans and observations. We can partly attribute this to the high cost of obtaining densely labelled data for textile manipulation, in both quality and quantity. Data collection is a long-standing issue in data-driven approaches to garment manipulation. To date, high-quality labelled garment manipulation data has mainly been generated through advanced data capture procedures that create simplified state estimations from real-world observations. In contrast, this work takes the reverse approach and generates real-world observations from object states. To achieve this, we present GARField (Garment Attached Radiance Field), to our knowledge the first differentiable rendering architecture for data generation from simulated states stored as triangle meshes. Code is available on the project website.