🤖 AI Summary
Underwater robotic manipulation is hampered by drastic illumination variations, color distortion, and low visibility, which make existing imitation learning methods insufficiently robust. This paper proposes Bi-AQUA, the first bilateral control-based imitation learning framework designed explicitly for underwater environments, adapting terrestrial bilateral imitation learning to underwater settings. Its core innovation is a three-level illumination-adaptive mechanism that requires no manual lighting labels: (i) a lighting encoder that extracts illumination representations from RGB images; (ii) illumination-conditioned modulation of visual backbone features using Feature-wise Linear Modulation (FiLM); and (iii) an explicit lighting token supplied to the Transformer encoder input. Evaluated on a real-world underwater pick-and-place task, the method substantially outperforms a bilateral baseline without lighting modeling under both static and dynamic illumination conditions. Ablation studies confirm that all three components are critical.
📝 Abstract
Underwater robotic manipulation is fundamentally challenged by extreme lighting variations, color distortion, and reduced visibility. We introduce Bi-AQUA, the first underwater bilateral control-based imitation learning framework that integrates lighting-aware visual processing for underwater robot arms. Bi-AQUA employs a hierarchical three-level lighting adaptation mechanism: (i) a Lighting Encoder that extracts lighting representations from RGB images without manual annotation and is implicitly supervised by the imitation objective; (ii) FiLM modulation of visual backbone features for adaptive, lighting-aware feature extraction; and (iii) an explicit lighting token added to the transformer encoder input for task-aware conditioning. Experiments on a real-world underwater pick-and-place task under diverse static and dynamic lighting conditions show that Bi-AQUA achieves robust performance and substantially outperforms a bilateral baseline without lighting modeling. Ablation studies further confirm that all three lighting-aware components are critical. This work bridges terrestrial bilateral control-based imitation learning and underwater manipulation, enabling force-sensitive autonomous operation in challenging marine environments. Additional material is available at: https://mertcookimg.github.io/bi-aqua
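To make the three-level mechanism concrete, the following is a minimal, dependency-free sketch of how a lighting embedding could drive FiLM modulation and a lighting token. It is an illustration under stated assumptions, not the authors' implementation: the mean-color "encoder", the identity-centered scale (1 + W·z), the choice to prepend the token to the sequence, and all function names and weight matrices are hypothetical. In Bi-AQUA the encoder and projections are learned through the imitation objective, whereas here fixed weights stand in for them.

```python
from statistics import fmean


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def matvec(matrix, vec):
    """Matrix-vector product; stands in for a learned linear layer."""
    return [dot(row, vec) for row in matrix]


def lighting_embedding(image):
    """Level 1 (assumed stand-in for the Lighting Encoder).

    Summarizes an RGB image, given as rows of (r, g, b) tuples, by its
    per-channel mean. The real encoder is a learned network.
    """
    pixels = [px for row in image for px in row]
    return [fmean(px[c] for px in pixels) for c in range(3)]


def film_params(z, w_gamma, w_beta):
    """Map the lighting embedding z to a per-channel scale and shift.

    Assumption: the scale is centered on 1, so z = 0 leaves features unchanged.
    """
    gamma = [1.0 + g for g in matvec(w_gamma, z)]
    beta = matvec(w_beta, z)
    return gamma, beta


def film(tokens, gamma, beta):
    """Level 2: FiLM, i.e. gamma * feature + beta applied channel-wise."""
    return [[g * f + b for f, g, b in zip(tok, gamma, beta)] for tok in tokens]


def build_encoder_input(visual_tokens, lighting_token):
    """Level 3 (assumed form): prepend the lighting token to the sequence."""
    return [lighting_token] + visual_tokens


if __name__ == "__main__":
    # Toy 2x2 image with a blue-green cast, and two 2-channel visual tokens.
    image = [[(0.1, 0.4, 0.6), (0.1, 0.4, 0.6)],
             [(0.2, 0.5, 0.7), (0.2, 0.5, 0.7)]]
    tokens = [[1.0, -1.0], [0.5, 2.0]]

    # Hypothetical fixed weights in place of learned parameters.
    w_gamma = [[0.5, 0.0, 0.0], [0.0, 0.0, -0.5]]
    w_beta = [[0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
    w_token = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

    z = lighting_embedding(image)
    gamma, beta = film_params(z, w_gamma, w_beta)
    modulated = film(tokens, gamma, beta)
    sequence = build_encoder_input(modulated, matvec(w_token, z))
    print(len(sequence), sequence)
```

The sketch exposes lighting information to the policy twice: once by rescaling and shifting backbone features, and once as a separate token the transformer can attend to. This mirrors the FiLM and lighting-token levels described in the abstract.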