🤖 AI Summary
Visuomotor learning for real-world robots suffers from low sample efficiency and from the overly restrictive assumption of isometric symmetries. Method: We propose Multi-Group Equivariance Augmentation (MEA), the first approach to model non-isometric symmetries as spatiotemporally decoupled, independent group transformations, thereby relaxing the isometry constraint. To support this, we introduce a novel POMDP formulation that accommodates non-isometric symmetries, and design a translation-equivariant voxelized visual representation. Our method integrates offline reinforcement learning with MEA-based data augmentation. Contribution/Results: We evaluate MEA on two manipulation tasks, both in simulation and on real robotic platforms. Experiments demonstrate significant improvements in sample efficiency and policy performance over baseline methods, validating the effectiveness and generality of non-isometric symmetry modeling for visuomotor learning.
📝 Abstract
Sample efficiency is critical for deploying visuomotor learning in real-world robotic manipulation. While task symmetry has emerged as a promising inductive bias for improving efficiency, most prior work is limited to isometric symmetries -- applying the same group transformation to all task objects across all timesteps. In this work, we explore non-isometric symmetries, applying multiple independent group transformations across spatial and temporal dimensions to relax these constraints. We introduce a novel formulation of the partially observable Markov decision process (POMDP) that incorporates non-isometric symmetry structures, and propose a simple yet effective data augmentation method, Multi-Group Equivariance Augmentation (MEA). We integrate MEA with offline reinforcement learning to enhance sample efficiency, and introduce a voxel-based visual representation that preserves translational equivariance. Extensive simulation and real-robot experiments across two manipulation domains demonstrate the effectiveness of our approach.
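To make the core idea concrete, here is a minimal sketch of what MEA-style augmentation could look like, assuming voxelized observations with one channel per task object, a position-valued action, and a discrete translation group. The helper names (`random_translation`, `translate_voxels`, `augment_transition`) and the action-transformation rule are illustrative assumptions, not the paper's implementation; the point they demonstrate is that each object and each timestep receives its own independently sampled group element, rather than one global isometry shared across the whole scene and trajectory.

```python
import numpy as np

def random_translation(max_shift: int, rng: np.random.Generator) -> np.ndarray:
    """Sample one element of a discrete 3D translation group."""
    return rng.integers(-max_shift, max_shift + 1, size=3)

def translate_voxels(voxels: np.ndarray, shift: np.ndarray) -> np.ndarray:
    """Translate a (D, H, W) voxel grid. np.roll wraps at the borders,
    a simplification that is harmless while objects stay off the edges."""
    return np.roll(voxels, tuple(shift), axis=(0, 1, 2))

def augment_transition(obs, action_pos, next_obs, max_shift=4, rng=None):
    """Non-isometric augmentation of one (obs, action, next_obs) transition.

    obs, next_obs: (n_objects, D, H, W) voxel grids, one channel per object.
    action_pos:    (3,) voxel coordinates targeted by the action, assumed
                   (hypothetically) to refer to object channel 0.

    Each object channel and each timestep gets its own independently sampled
    translation -- the spatiotemporal decoupling that relaxes the usual
    single-global-isometry assumption.
    """
    rng = rng or np.random.default_rng()
    obs_aug, next_obs_aug = obs.copy(), next_obs.copy()
    for k in range(obs.shape[0]):
        g_t = random_translation(max_shift, rng)    # group element at time t
        g_t1 = random_translation(max_shift, rng)   # independent element at t+1
        obs_aug[k] = translate_voxels(obs[k], g_t)
        next_obs_aug[k] = translate_voxels(next_obs[k], g_t1)
        if k == 0:
            # Equivariance: the action transforms with its object's group element.
            action_pos = (np.asarray(action_pos) + g_t) % np.array(obs.shape[1:])
    return obs_aug, action_pos, next_obs_aug
```

In an offline RL pipeline, a buffer-level wrapper would apply `augment_transition` to each sampled batch before the update step; under the symmetry assumptions above, every augmented transition remains a valid sample, which is where the sample-efficiency gain would come from.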