Adapt On-the-Go: Behavior Modulation for Single-Life Robot Deployment

📅 2023-11-02
🏛️ arXiv.org
📈 Citations: 8
Influential: 0
🤖 AI Summary
To address unsupervised, real-time adaptation when a robot encounters unseen conditions during deployment, this paper proposes ROAM (RObust Autonomous Modulation), a framework that uses the perceived value of pretrained behaviors to select and modulate them online. ROAM operates without human intervention: within a single episode at test time (a "single life"), it autonomously selects and adapts behaviors from a pretrained library. Its core components include a behavior value estimation module, a meta-level behavior modulation network, and an online policy reweighting algorithm. Evaluated in simulation and on a real Go1 quadruped robot, ROAM handles strongly out-of-distribution situations, such as moving forward with roller skates on its feet, and adapts over 2x as efficiently as prior methods. It is the first approach to enable unsupervised, composable, real-time deployment driven by a reinforcement-learning-based behavior library.
📝 Abstract
To succeed in the real world, robots must cope with situations that differ from those seen during training. We study the problem of adapting on-the-fly to such novel scenarios during deployment, by drawing upon a diverse repertoire of previously learned behaviors. Our approach, RObust Autonomous Modulation (ROAM), introduces a mechanism based on the perceived value of pre-trained behaviors to select and adapt pre-trained behaviors to the situation at hand. Crucially, this adaptation process all happens within a single episode at test time, without any human supervision. We demonstrate that ROAM enables a robot to adapt rapidly to changes in dynamics both in simulation and on a real Go1 quadruped, even successfully moving forward with roller skates on its feet. Our approach adapts over 2x as efficiently compared to existing methods when facing a variety of out-of-distribution situations during deployment by effectively choosing and adapting relevant behaviors on-the-fly.
Problem

Research questions and friction points this paper is trying to address.

Adapting robots to novel scenarios during deployment
Selecting and adapting pre-trained behaviors autonomously
Improving efficiency in out-of-distribution situations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adapts pre-trained behaviors on-the-fly
Uses perceived value for behavior selection
Operates without human supervision
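
The idea of using perceived value for behavior selection can be illustrated with a small sketch. This is not the paper's implementation: it assumes each pretrained behavior comes with a value function that scores the current state, and that a behavior is sampled from a softmax over those scores. The names `select_behavior`, `walk_value`, `skate_value`, the `temperature` parameter, and the one-dimensional "friction" state are all hypothetical and chosen only for illustration.

```python
import math
import random
from typing import Callable, List, Optional, Sequence, Tuple

State = Sequence[float]
ValueFn = Callable[[State], float]


def softmax(values: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    top = max(values)
    exps = [math.exp((v - top) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def select_behavior(
    state: State,
    value_fns: Sequence[ValueFn],
    temperature: float = 1.0,
    rng: Optional[random.Random] = None,
) -> Tuple[int, List[float]]:
    """Sample a behavior index with probability proportional to exp(V_i(state) / temperature).

    Assumption (not taken from the paper): every behavior exposes a scalar
    value estimate for the current state, and higher means "better suited".
    """
    rng = rng or random.Random()
    values = [value_fn(state) for value_fn in value_fns]
    probs = softmax(values, temperature)
    index = rng.choices(range(len(value_fns)), weights=probs, k=1)[0]
    return index, probs


# Hypothetical toy value functions: the state is (ground_friction,).
def walk_value(state: State) -> float:
    return state[0]          # walking looks better on high-friction ground


def skate_value(state: State) -> float:
    return 1.0 - state[0]    # a skating-like gait looks better on low friction


if __name__ == "__main__":
    index, probs = select_behavior(
        (0.1,), [walk_value, skate_value], temperature=0.1, rng=random.Random(0)
    )
    print(index, probs)
```

A lower `temperature` makes the choice closer to greedy (always picking the highest-value behavior), while a higher one keeps the selection exploratory. Calling `select_behavior` again at every timestep would let the chosen behavior change as the state changes within a single episode.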