AI Summary
This work addresses the inefficiency of training control policies for multirotor aerial vehicles and their limited transferability across vehicle configurations. To overcome these challenges, the authors propose a physics-aware neural control architecture that combines a reinforcement learning-based controller with a supervised control allocation network. A novel policy library initialization mechanism, grounded in a similarity metric derived from policy evaluation, enables efficient cross-configuration policy transfer for the first time. Evaluated on quadrotor and hexarotor platforms, the approach reduces the required number of environment interactions by up to 73.5% on average while achieving state-of-the-art control performance. The method's effectiveness is validated through extensive experiments in both simulation and real-world environments.
Abstract
Efficiently training control policies for robots is a major challenge that can greatly benefit from knowledge gained while training similar systems, i.e., cross-embodiment knowledge transfer. In this work, we focus on accelerating policy training using a library-based initialization scheme that enables effective knowledge transfer across multirotor configurations. By leveraging a physics-aware neural control architecture that combines a reinforcement learning-based controller with a supervised control allocation network, we enable the reuse of previously trained policies. To this end, we utilize a policy evaluation-based similarity measure that identifies suitable policies in a library for initialization. We demonstrate that this measure correlates with the reduction in environment interactions needed to reach target performance and is therefore well suited for guiding initialization. Extensive simulation and real-world experiments confirm that our control architecture achieves state-of-the-art control performance, and that our initialization scheme saves up to $73.5\%$ of environment interactions on average (compared to training a policy from scratch) across diverse quadrotor and hexarotor designs, paving the way for efficient cross-embodiment transfer in reinforcement learning.
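The core selection step described above can be sketched in a few lines. This is a minimal, purely illustrative Python sketch, not the paper's implementation: all names (`ToyEnv`, `evaluate_policy`, `select_initial_policy`) and the 1-D setpoint environment are assumptions. It shows the general idea of zero-shot evaluating each stored policy on the target configuration and using the resulting score as the similarity measure for choosing an initialization.

```python
class ToyEnv:
    """Hypothetical 1-D setpoint-tracking stand-in for a multirotor
    configuration; `gain` loosely mimics a configuration parameter
    such as actuator effectiveness."""
    def __init__(self, gain, horizon=20):
        self.gain, self.horizon = gain, horizon

    def reset(self):
        self.x, self.t = 1.0, 0
        return self.x

    def step(self, action):
        self.x += self.gain * action   # actuation scaled by the configuration
        self.t += 1
        reward = -abs(self.x)          # penalize distance from the setpoint 0
        return self.x, reward, self.t >= self.horizon


def evaluate_policy(policy, env, episodes=3):
    """Average episodic return of `policy` on `env` (zero-shot evaluation)."""
    total = 0.0
    for _ in range(episodes):
        obs, done, ret = env.reset(), False, 0.0
        while not done:
            obs, reward, done = env.step(policy(obs))
            ret += reward
        total += ret
    return total / episodes


def select_initial_policy(library, env):
    """Return the library entry with the highest evaluation score on the
    target configuration; the score serves as the similarity measure."""
    scores = {name: evaluate_policy(p, env) for name, p in library.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

A usage example under these assumptions: given a library of two linear policies, `select_initial_policy(library, ToyEnv(gain=0.5))` picks the one that tracks the setpoint best on the new configuration, and that policy would then warm-start further training instead of starting from scratch.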