Learning to Balance Motor Thermal Safety and Quadrupedal Locomotion Performance with Residual Policy

📅 2026-05-26
🤖 AI Summary
This work addresses the limited endurance of payload-carrying quadrupedal robots caused by motor overheating, a problem that existing locomotion approaches largely overlook because they lack integrated thermal management. The study embeds a whole-body motor thermal model into a reinforcement learning framework and proposes a thermal-aware residual policy architecture. A nominal policy is first pre-trained to acquire baseline locomotion skills; a residual policy driven by the thermal state then adjusts the nominal actions online. This two-stage approach mitigates overheating while preserving locomotion performance. In simulation and real-world experiments, a Unitree A1 robot carrying a 3 kg payload sustains stable operation for over 13 minutes with the proposed strategy, whereas the nominal policy alone leads to motor overheating within approximately 5 minutes.
📝 Abstract
Motor thermal management is often overlooked in the context of electrically actuated robots, particularly legged robots, yet motor overheating is a key factor that limits long-duration locomotion, especially under payload conditions. This paper integrates a whole-body thermal model of a quadruped robot into the reinforcement learning pipeline to update motor temperatures, and proposes a two-stage training framework for motor thermal management. In this framework, a nominal policy is first pre-trained as a locomotion baseline capable of traversing diverse terrains. A residual policy is then trained on top of the nominal policy to provide corrective actions based on the robot's thermal state, ensuring high performance under low-temperature conditions and preventing motor overheating under high-temperature conditions. Simulation results demonstrate that the proposed policy achieves an effective balance between motor thermal safety and locomotion performance. Real-world experiments on a Unitree A1 quadruped robot further validate the approach: under a 3 kg payload, the robot achieves stable locomotion across multiple terrains for over 13 minutes, while the nominal policy alone leads to motor overheating in about 5 minutes.
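The two-stage structure described in the abstract can be illustrated with a small sketch. Nothing below is taken from the paper: the first-order lumped thermal model, all numeric parameters, the choice to treat actions as joint torques, and the hand-written `residual_policy` (a stand-in for a learned network) are assumptions made only to show how a residual correction driven by the thermal state might be added to a nominal action.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class LumpedThermalModel:
    """Per-motor first-order thermal model (an assumption, not the paper's model)."""
    ambient_c: float = 25.0          # hypothetical ambient temperature, deg C
    heat_capacity: float = 100.0     # hypothetical thermal capacity, J/K
    thermal_resistance: float = 2.0  # hypothetical resistance to ambient, K/W
    loss_coeff: float = 0.5          # hypothetical Joule-loss coefficient, W/(N*m)^2

    def step(self, temps: Sequence[float], torques: Sequence[float], dt: float) -> List[float]:
        """Advance every motor temperature by one explicit Euler step."""
        updated = []
        for temp, tau in zip(temps, torques):
            heating = self.loss_coeff * tau * tau                 # W, grows with torque^2
            cooling = (temp - self.ambient_c) / self.thermal_resistance  # W, to ambient
            updated.append(temp + dt * (heating - cooling) / self.heat_capacity)
        return updated


def nominal_policy(obs: Sequence[float]) -> List[float]:
    """Stand-in for the pre-trained locomotion policy: constant torque demand."""
    return [10.0 for _ in obs]


def residual_policy(obs: Sequence[float], temps: Sequence[float],
                    soft_limit_c: float = 60.0, gain: float = 0.2,
                    max_reduction: float = 8.0) -> List[float]:
    """Hand-written stand-in for the learned residual: zero correction while a
    motor is cool, increasingly negative correction above a soft limit."""
    return [-min(max_reduction, gain * max(0.0, t - soft_limit_c)) for t in temps]


def combined_action(obs: Sequence[float], temps: Sequence[float]) -> List[float]:
    """Final action = nominal action + thermal-state-driven residual."""
    base = nominal_policy(obs)
    correction = residual_policy(obs, temps)
    return [b + r for b, r in zip(base, correction)]


def rollout(use_residual: bool, steps: int = 5000, dt: float = 0.1,
            n_motors: int = 12) -> List[float]:
    """Simulate motor temperatures under either policy and return final values."""
    model = LumpedThermalModel()
    temps = [model.ambient_c] * n_motors
    obs = [0.0] * n_motors  # placeholder observation; unused by the stand-ins
    for _ in range(steps):
        action = combined_action(obs, temps) if use_residual else nominal_policy(obs)
        temps = model.step(temps, action, dt)
    return temps


if __name__ == "__main__":
    print("nominal only :", round(rollout(False)[0], 1), "deg C")
    print("with residual:", round(rollout(True)[0], 1), "deg C")
```

In this toy setup the residual leaves the nominal action untouched while the motors are cool and only reduces the demand once a motor warms past the soft limit, which mirrors the stated goal of high performance at low temperature and overheating prevention at high temperature. A learned residual would replace the fixed rule and would be trained with the thermal model in the loop.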
Problem

Research questions and friction points this paper is trying to address.

motor thermal management
quadrupedal locomotion
motor overheating
thermal safety
payload conditions
Innovation

Methods, ideas, or system contributions that make the work stand out.

thermal-aware reinforcement learning
residual policy
motor thermal management
quadrupedal locomotion
whole-body thermal model
Yuhang Wan
School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan, 430074, China
Weixian Lin
School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan, 430074, China
Letian Qian
School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan, 430074, China
Yiqi Zou
School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan, 430074, China
Weiwei Wu
Computer Science, Southeast University
Shengwei Wu
School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan, 430074, China
Chuanlin Zhao
School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan, 430074, China
Xin Luo
University of Science and Technology of China
Computer Vision