🤖 AI Summary
This work addresses the limited generalization of pretrained generative policies in dexterous manipulation, where fine-tuning must balance global exploration with local error correction. To overcome this challenge, the authors propose the Residual Flow Steering (RFS) framework, which enables data-efficient reinforcement learning fine-tuning of flow-matching-based pretrained policies by jointly optimizing residual actions and the latent noise distribution. RFS combines local refinement through residual corrections with global exploration through latent-space modulation, preserving the expressive capacity of the original policy while enabling rapid adaptation. Experiments on dexterous manipulation tasks show that RFS fine-tunes pretrained base policies efficiently in both simulation and real-world settings.
📝 Abstract
Imitation learning has emerged as an effective approach for bootstrapping sequential decision-making in robotics, achieving strong performance even in high-dimensional dexterous manipulation tasks. Recent behavior cloning methods further leverage expressive generative models, such as diffusion models and flow matching, to represent multimodal action distributions. However, policies pretrained in this manner often exhibit limited generalization and require additional fine-tuning to achieve robust performance at deployment time. Such adaptation must preserve the global exploration benefits of pretraining while enabling rapid correction of local execution errors. We propose Residual Flow Steering (RFS), a data-efficient reinforcement learning framework for adapting pretrained generative policies. RFS steers a pretrained flow-matching policy by jointly optimizing a residual action and a latent noise distribution, enabling complementary forms of exploration: local refinement through residual corrections and global exploration through latent-space modulation. This design allows efficient adaptation while retaining the expressive structure of the pretrained policy. We demonstrate the effectiveness of RFS on dexterous manipulation tasks, showing efficient fine-tuning in both simulation and real-world settings when adapting pretrained base policies. Project website: https://weirdlabuw.github.io/rfs.
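
The two steering signals described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no architectural details, so the velocity field, the Gaussian form of the latent noise distribution, the additive bounded residual, and every name below (`base_velocity`, `sample_base_action`, `steered_action`, `residual_scale`) are assumptions made for illustration only. A frozen base policy maps latent noise to an action by Euler-integrating a velocity field; the steering step chooses where the noise is sampled from (global exploration) and adds a small correction to the result (local refinement).

```python
import numpy as np

def base_velocity(a, t, s, W_s, W_a):
    # Hypothetical frozen velocity field v(a, t, s); a stand-in for a
    # pretrained flow-matching network, not the one used in the paper.
    return np.tanh(W_s @ s + W_a @ a) * (1.0 - 0.5 * t)

def sample_base_action(s, z, W_s, W_a, n_steps=10):
    # Euler-integrate da/dt = v(a, t, s) from t=0 (latent noise z) to t=1.
    a = z.copy()
    dt = 1.0 / n_steps
    for k in range(n_steps):
        a = a + dt * base_velocity(a, k * dt, s, W_s, W_a)
    return a

def steered_action(s, z_mean, z_log_std, residual, rng, W_s, W_a,
                   residual_scale=0.1):
    # Global exploration (assumed Gaussian): sample the latent noise from a
    # learned distribution instead of the standard normal prior.
    z = z_mean + np.exp(z_log_std) * rng.standard_normal(z_mean.shape)
    a_base = sample_base_action(s, z, W_s, W_a)
    # Local refinement (assumed additive): a bounded correction on top of
    # the base action; the base policy weights stay untouched.
    return a_base + residual_scale * np.tanh(residual)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W_s = rng.standard_normal((3, 4))  # toy weights: 4-D state, 3-D action
    W_a = rng.standard_normal((3, 3))
    s = rng.standard_normal(4)
    a = steered_action(s, np.zeros(3), np.zeros(3), np.zeros(3), rng, W_s, W_a)
    print(a.shape)
```

In a reinforcement learning loop, `z_mean`, `z_log_std`, and `residual` would be outputs of a small learned steering policy conditioned on the state, while `W_s` and `W_a` stay frozen. With a zero residual and a collapsed noise distribution, the sketch reduces to the base policy evaluated at `z_mean`.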