🤖 AI Summary
This work addresses the limited endurance of quadrupedal robots under payload due to motor overheating, a challenge largely overlooked by existing approaches that lack integrated thermal management. The study proposes a thermal-aware residual policy architecture by embedding a whole-body motor thermal model into a reinforcement learning framework. It first pre-trains a nominal policy to acquire baseline locomotion skills and then employs a thermal-state-driven residual policy to adjust actions online. This two-stage approach mitigates overheating while preserving locomotion performance. Simulations and real-world experiments show that a Unitree A1 robot running the proposed policy sustains stable operation for over 13 minutes under a 3 kg payload, whereas the nominal policy alone leads to motor overheating within approximately 5 minutes. The approach thereby improves both thermal safety and mission endurance.
📝 Abstract
Motor thermal management is often overlooked in the context of electrically actuated robots, particularly legged robots, but motor overheating is a key factor that limits long-duration locomotion, especially under payload conditions. This paper integrates a whole-body thermal model of a quadruped robot into the reinforcement learning pipeline to update motor temperatures, and proposes a two-stage training framework for motor thermal management. In this framework, a nominal policy is first pre-trained as a locomotion baseline capable of traversing diverse terrains. A residual policy is then trained on top of the nominal policy to provide corrective actions based on the robot's thermal state, ensuring high performance under low-temperature conditions and preventing motor overheating under high-temperature conditions. Simulation results demonstrate that the proposed policy achieves an effective balance between motor thermal safety and locomotion performance. Real-world experiments on a Unitree A1 quadruped robot further validate the approach: under a 3 kg payload, the robot achieves stable locomotion across multiple terrains for over 13 minutes, while the nominal policy alone leads to motor overheating in about 5 minutes.
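The abstract does not specify the form of the thermal model or how the residual action is combined with the nominal one. As a rough illustration only, the sketch below assumes a first-order lumped-parameter model per motor (Joule heating in, conduction to ambient out) and a simple additive residual. All names (`MotorThermalParams`, `step_temperature`, `combine_actions`), the parameter values, and the `scale` factor are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class MotorThermalParams:
    # All values below are illustrative assumptions, not Unitree A1 data.
    resistance_ohm: float = 0.2        # winding resistance
    torque_constant: float = 0.1       # N*m per ampere
    thermal_resistance: float = 2.0    # K/W, winding to ambient
    thermal_capacitance: float = 50.0  # J/K
    ambient_c: float = 25.0            # ambient temperature in deg C


def step_temperature(temp_c, torque, dt, p):
    """Advance one motor temperature by dt seconds using explicit Euler."""
    current = torque / p.torque_constant                      # A
    heat_in = p.resistance_ohm * current ** 2                 # W, Joule heating
    heat_out = (temp_c - p.ambient_c) / p.thermal_resistance  # W, cooling
    return temp_c + dt * (heat_in - heat_out) / p.thermal_capacitance


def combine_actions(nominal, residual, scale=0.25):
    """Add a scaled residual correction to the nominal joint actions."""
    return [a + scale * r for a, r in zip(nominal, residual)]


if __name__ == "__main__":
    params = MotorThermalParams()
    temp = params.ambient_c
    for _ in range(3000):  # 300 s of constant 1 N*m load at dt = 0.1 s
        temp = step_temperature(temp, torque=1.0, dt=0.1, p=params)
    print(f"temperature after 300 s: {temp:.1f} C")
    print(combine_actions([0.1, -0.2], [0.4, 0.4]))
```

In a training loop of this kind, the per-motor temperatures would be stepped alongside the physics simulation and fed to the residual policy as part of its observation, so that the correction can stay near zero while the motors are cool and grow as they heat up. How the paper actually structures this is not stated in the abstract.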