ReSteer: Quantifying and Refining the Steerability of Multitask Robot Policies

📅 2026-03-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses poor steerability in multitask robot policies, which often fail to respond to new instructions issued during execution. The proposed ReSteer framework introduces what it describes as the first quantitative steerability metric, based on trajectory distribution overlap, together with a method for identifying low-steerability states without full policy rollouts. ReSteer runs a closed-loop pipeline comprising a steerability estimator, a steerability-aware data generator, and a policy self-refinement training procedure, improving responsiveness to user commands issued mid-execution. In the LIBERO simulation environment, the approach improves steerability by 11% over 18k rollouts. Real-robot experiments further show that the system can incorporate new user instructions at arbitrary points during execution.

📝 Abstract
Despite strong multi-task pretraining, existing policies often exhibit poor task steerability. For example, a robot may fail to respond to a new instruction "put the bowl in the sink" when moving towards the oven, executing "close the oven", even though it can complete both tasks when executed separately. We propose ReSteer, a framework to quantify and improve task steerability in multitask robot policies. We conduct an exhaustive evaluation of state-of-the-art policies, revealing a common lack of steerability. We find that steerability is associated with limited overlap among training task trajectory distributions, and introduce a proxy metric to measure this overlap from policy behavior. Building on this insight, ReSteer improves steerability via three components: (i) a steerability estimator that identifies low-steerability states without full-rollout evaluation, (ii) a steerable data generator that synthesizes motion segments from these states, and (iii) a self-refinement pipeline that improves policy steerability using the generated data. In simulation on LIBERO, ReSteer improves steerability by 11% over 18k rollouts. In real-world experiments, we show that improved steerability is critical for interactive use, enabling users to instruct robots to perform any task at any time. We hope this work motivates further study on quantifying steerability and data collection strategies for large robot policies.
Problem

Research questions and friction points this paper is trying to address.

steerability
multitask robot policies
task switching
interactive robotics
policy refinement
Innovation

Methods, ideas, or system contributions that make the work stand out.

steerability
multitask robot policies
trajectory distribution overlap
self-refinement
steerable data generation
Zhenyang Chen
Georgia Institute of Technology
Robotics, Machine Learning
Alan Tian
Georgia Institute of Technology
Liquan Wang
Georgia Institute of Technology
Benjamin Joffe
Georgia Institute of Technology
Yingyan Celine Lin
Georgia Institute of Technology
Yuxiao Chen
Georgia Institute of Technology
Siddharth Karamcheti
PhD Student - Stanford University
Robotics, Human-Robot Interaction, Natural Language Processing, Machine Learning
Danfei Xu
Assistant Professor at School of Interactive Computing
Robot Learning, Computer Vision