Learning to Recover: Dynamic Reward Shaping with Wheel-Leg Coordination for Fallen Robots

📅 2025-06-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Wheeled-legged robots face significant challenges in adaptive post-fall recovery: policies built on preplanned motions are fragile, dynamics models are oversimplified, and reward signals are sparse. This paper proposes a learning-based framework that combines curriculum learning with episode-based dynamic reward shaping to balance exploration of diverse recovery maneuvers against precise posture refinement. An asymmetric actor-critic architecture uses privileged simulation information to accelerate training, noise-injected observations improve robustness against uncertainties, and coordinated wheel-leg motion lowers joint torque through energy transfer. On two quadruped wheeled-legged platforms, the method achieves recovery success rates of up to 99.1% and 97.8%, reduces joint torque consumption by 15.8% and 26.2%, and requires no platform-specific tuning.

📝 Abstract

Adaptive recovery from falls is an essential skill for the practical deployment of wheeled-legged robots, which uniquely combine the agility of legs with the speed of wheels for rapid recovery. However, traditional methods that rely on preplanned recovery motions, simplified dynamics, or sparse rewards often fail to produce robust recovery policies. This paper presents a learning-based framework integrating Episode-based Dynamic Reward Shaping and curriculum learning, which dynamically balances exploration of diverse recovery maneuvers with precise posture refinement. An asymmetric actor-critic architecture accelerates training by leveraging privileged information in simulation, while noise-injected observations enhance robustness against uncertainties. We further demonstrate that synergistic wheel-leg coordination reduces joint torque consumption by 15.8% and 26.2% and improves stabilization through energy transfer mechanisms. In extensive evaluations on two distinct quadruped platforms, the framework achieves recovery success rates of up to 99.1% and 97.8% without platform-specific tuning. The supplementary material is available at https://boyuandeng.github.io/L2R-WheelLegCoordination/
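
To make the idea of balancing exploration against posture refinement within an episode concrete, the sketch below blends two reward terms with a weight that grows as the episode progresses. The abstract does not give the actual shaping rule, so the sigmoid schedule, the function names `shaping_weight` and `shaped_reward`, and the parameters `midpoint` and `sharpness` are all illustrative assumptions, not the authors' formulation.

```python
import math


def shaping_weight(step: int, episode_len: int,
                   midpoint: float = 0.5, sharpness: float = 10.0) -> float:
    """Hypothetical weight in (0, 1) that rises over the course of an episode.

    Assumption: a sigmoid over normalized episode progress. The paper's
    real schedule is not stated in the abstract.
    """
    progress = step / max(episode_len - 1, 1)  # 0.0 at start, 1.0 at end
    return 1.0 / (1.0 + math.exp(-sharpness * (progress - midpoint)))


def shaped_reward(explore_r: float, refine_r: float,
                  step: int, episode_len: int) -> float:
    """Blend an exploration term and a posture-refinement term.

    Early steps favor explore_r, late steps favor refine_r.
    """
    w = shaping_weight(step, episode_len)
    return (1.0 - w) * explore_r + w * refine_r


if __name__ == "__main__":
    for t in (0, 50, 100):
        print(t, round(shaping_weight(t, 101), 4))
```

Under this assumed schedule, a policy is rewarded mostly for trying varied recovery maneuvers at the start of an episode and mostly for reaching the target posture near the end.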

Problem

Research questions and friction points this paper is trying to address.

Enabling wheeled-legged robots to recover robustly from falls

Overcoming limitations of preplanned motions and sparse rewards

Optimizing wheel-leg coordination for energy-efficient recovery

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic Reward Shaping balances exploration and refinement

Asymmetric actor-critic uses privileged simulation information

Wheel-leg coordination reduces torque and improves stabilization
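
The asymmetric actor-critic idea with noise-injected observations can be illustrated by how the two networks' inputs might be assembled. This is a minimal sketch under stated assumptions: the helper names `actor_observation` and `critic_observation`, the Gaussian noise model, and the example privileged quantities are hypothetical and are not taken from the paper.

```python
import random
from typing import List, Sequence


def actor_observation(proprio: Sequence[float], noise_std: float,
                      rng: random.Random) -> List[float]:
    """Actor input: proprioceptive readings with injected noise.

    Assumption: independent zero-mean Gaussian noise per channel.
    """
    return [x + rng.gauss(0.0, noise_std) for x in proprio]


def critic_observation(proprio: Sequence[float],
                       privileged: Sequence[float]) -> List[float]:
    """Critic input: clean proprioception plus simulation-only quantities.

    The critic is used only during training, so it can consume values
    that are unavailable on the real robot.
    """
    return list(proprio) + list(privileged)


if __name__ == "__main__":
    rng = random.Random(0)
    proprio = [0.1, -0.2, 0.3]      # e.g. joint positions (illustrative)
    privileged = [0.8, 9.5]         # e.g. friction, contact force (illustrative)
    print(actor_observation(proprio, 0.05, rng))
    print(critic_observation(proprio, privileged))
```

Only the actor runs at deployment time, which is why its input is restricted to quantities the robot can sense and is perturbed during training.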
