Learning to Balance Motor Thermal Safety and Quadrupedal Locomotion Performance with Residual Policy

📅 2026-05-26

🤖 AI Summary

This work addresses the limited endurance of payload-carrying quadrupedal robots caused by motor overheating, a problem that existing approaches largely overlook because they lack integrated thermal management. It embeds a whole-body motor thermal model into a reinforcement learning framework and proposes a thermal-aware residual policy architecture. A nominal policy is first pre-trained to acquire baseline locomotion skills; a residual policy driven by the robot's thermal state then adjusts the actions online. This two-stage approach mitigates overheating while preserving locomotion performance. In simulation and real-world experiments, a Unitree A1 robot using the proposed strategy sustains stable operation for over 13 minutes under a 3 kg payload, whereas the nominal policy alone overheats within approximately 5 minutes.
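The residual idea can be illustrated with a minimal sketch. It is not the paper's implementation: the learned residual network is replaced by a hand-written rule, and the names `toy_residual` and `compose_action`, the temperature thresholds, and the action limit are all hypothetical values chosen for illustration.

```python
from typing import List, Sequence


def toy_residual(nominal_action: Sequence[float],
                 motor_temps_c: Sequence[float],
                 warn_temp_c: float = 60.0,   # assumed threshold, not from the paper
                 max_temp_c: float = 80.0) -> List[float]:
    """Hypothetical stand-in for a learned residual policy.

    Returns a correction that is zero for cool motors and cancels the
    nominal command as a motor approaches max_temp_c.
    """
    residual = []
    for action, temp in zip(nominal_action, motor_temps_c):
        # Normalised "heat" in [0, 1]: 0 below warn_temp_c, 1 at max_temp_c.
        heat = min(max((temp - warn_temp_c) / (max_temp_c - warn_temp_c), 0.0), 1.0)
        residual.append(-heat * action)
    return residual


def compose_action(nominal_action: Sequence[float],
                   residual_action: Sequence[float],
                   limit: float = 1.0) -> List[float]:
    """Final command = nominal action + residual correction, clipped to +/- limit."""
    return [min(max(a + r, -limit), limit)
            for a, r in zip(nominal_action, residual_action)]


nominal = [0.5, -0.4, 0.2]
cool_cmd = compose_action(nominal, toy_residual(nominal, [30.0, 30.0, 30.0]))
hot_cmd = compose_action(nominal, toy_residual(nominal, [80.0, 80.0, 80.0]))
```

With cool motors the residual vanishes and the nominal command passes through unchanged; with motors at the assumed limit the command is cancelled. A trained residual would presumably learn a less drastic trade-off than this rule.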

📝 Abstract

Motor thermal management is often overlooked in the context of electrically actuated robots, particularly legged robots, but motor overheating is a key factor that limits long-duration locomotion, especially under payload conditions. This paper integrates a whole-body thermal model of a quadruped robot into the reinforcement learning pipeline to update motor temperatures, and proposes a two-stage training framework for motor thermal management. In this framework, a nominal policy is first pre-trained as a locomotion baseline capable of traversing diverse terrains. A residual policy is then trained on top of the nominal policy to provide corrective actions based on the robot's thermal state, ensuring high performance under low-temperature conditions and preventing motor overheating under high-temperature conditions. Simulation results demonstrate that the proposed policy achieves an effective balance between motor thermal safety and locomotion performance. Real-world experiments on a Unitree A1 quadruped robot further validate the approach: under a 3 kg payload, the robot achieves stable locomotion across multiple terrains for over 13 minutes, while the nominal policy alone leads to motor overheating in about 5 minutes.
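The abstract does not specify the form of the whole-body thermal model. As a rough illustration of how a simulator could update a motor temperature at each step, the sketch below assumes a single-node lumped model per motor: heating proportional to torque squared, cooling proportional to the temperature difference from ambient, integrated with explicit Euler. The function name `step_motor_temperature` and every coefficient are hypothetical, not taken from the paper.

```python
def step_motor_temperature(temp_c: float,
                           torque_nm: float,
                           dt_s: float,
                           ambient_c: float = 25.0,
                           heat_coeff: float = 0.5,          # W per (N*m)^2, assumed
                           thermal_resistance: float = 2.0,  # K per W, assumed
                           heat_capacity: float = 100.0      # J per K, assumed
                           ) -> float:
    """Advance one motor's temperature by dt_s with an explicit Euler step.

    Assumed model: C * dT/dt = k * tau^2 - (T - T_ambient) / R
    """
    heating_w = heat_coeff * torque_nm ** 2
    cooling_w = (temp_c - ambient_c) / thermal_resistance
    return temp_c + dt_s * (heating_w - cooling_w) / heat_capacity


# Hold a constant 10 N*m torque for 20 s of simulated time at 50 Hz.
temps = [25.0]
for _ in range(1000):
    temps.append(step_motor_temperature(temps[-1], torque_nm=10.0, dt_s=0.02))
```

Under this assumed model the temperature rises toward the steady state `ambient_c + thermal_resistance * heat_coeff * torque_nm ** 2`, which is why sustained high torque under payload eventually overheats a motor unless the policy reduces the load.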

Problem

Research questions and friction points this paper is trying to address.

motor thermal management

quadrupedal locomotion

motor overheating

thermal safety

payload conditions

Innovation

Methods, ideas, or system contributions that make the work stand out.

thermal-aware reinforcement learning

residual policy

motor thermal management