Bipedalism for Quadrupedal Robots: Versatile Loco-Manipulation through Risk-Adaptive Reinforcement Learning

📅 2025-07-27
🤖 AI Summary
This work addresses the loss of bipedal stability that quadrupedal robots suffer when their forelimbs interact with the environment, for example when pushing carts, probing obstacles, or carrying loads. It proposes a risk-adaptive distributional reinforcement learning framework that, during training, adjusts how conservative the policy is according to the coefficient of variation of the estimated return distribution. The method outperforms baselines in simulation and, through sim-to-real transfer, is deployed on a Unitree Go2 robot. To our knowledge, this is the first approach enabling stable, robust multi-task loco-manipulation of this kind: the robot keeps a stable bipedal gait while its forelimbs are free for manipulation, balancing safety against operational flexibility. The method requires no additional mechanical design or explicit motion planning module, which reduces system complexity, and it offers an end-to-end decision-making approach for embodied intelligence in quadrupedal robots.

📝 Abstract
Loco-manipulation of quadrupedal robots has broadened robotic applications, but using legs as manipulators often compromises locomotion, while mounting arms complicates the system. To mitigate this issue, we introduce bipedalism for quadrupedal robots, thus freeing the front legs for versatile interactions with the environment. We propose a risk-adaptive distributional Reinforcement Learning (RL) framework designed for quadrupedal robots walking on their hind legs, balancing worst-case conservativeness with optimal performance in this inherently unstable task. During training, the adaptive risk preference is dynamically adjusted based on the uncertainty of the return, measured by the coefficient of variation of the estimated return distribution. Extensive experiments in simulation show our method's superior performance over baselines. Real-world deployment on a Unitree Go2 robot further demonstrates the versatility of our policy, enabling tasks like cart pushing, obstacle probing, and payload transport, while showcasing robustness against challenging dynamics and external disturbances.
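The uncertainty measure named in the abstract, the coefficient of variation of the estimated return distribution, can be illustrated with a minimal sketch. The abstract does not say how the coefficient is mapped to a risk preference, so everything beyond the coefficient itself is an assumption made for illustration: the return distribution is represented by equally weighted quantile samples, the risk preference is a CVaR level alpha, and the exponential mapping with its parameters alpha_min, alpha_max, and scale is hypothetical rather than the paper's rule.

```python
import numpy as np


def coefficient_of_variation(quantiles, eps=1e-8):
    """Return std / |mean| of a return distribution.

    Assumption: the distribution is given as equally weighted quantile
    samples, as a quantile-based distributional critic might output.
    """
    q = np.asarray(quantiles, dtype=float)
    return q.std() / (abs(q.mean()) + eps)


def adaptive_risk_level(cv, alpha_min=0.1, alpha_max=1.0, scale=1.0):
    """Hypothetical mapping from uncertainty to a CVaR level alpha.

    cv = 0 gives alpha_max (risk-neutral); larger cv pushes alpha toward
    alpha_min (more conservative). This rule is illustrative only.
    """
    return alpha_min + (alpha_max - alpha_min) * np.exp(-scale * cv)


def cvar(quantiles, alpha):
    """Lower-tail CVaR: mean of the worst ceil(alpha * N) quantile samples."""
    q = np.sort(np.asarray(quantiles, dtype=float))
    k = max(1, int(np.ceil(alpha * q.size)))
    return q[:k].mean()
```

With alpha = 1 the CVaR reduces to the mean of the quantile samples, which corresponds to a risk-neutral value estimate; smaller alpha averages only the worst outcomes, which corresponds to the worst-case conservativeness the abstract mentions.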
Problem

Research questions and friction points this paper is trying to address.

Enabling quadrupedal robots to use front legs for versatile interactions
Balancing locomotion and manipulation in inherently unstable bipedal walking
Adapting risk preferences based on return uncertainty in reinforcement learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

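Bipedal walking on the hind legs frees a quadrupedal robot's front legs for versatile loco-manipulation
Risk-adaptive distributional RL balances worst-case conservativeness with optimal performance
Risk preference is adjusted during training according to the coefficient of variation of the estimated return distribution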