Sampling-Based System Identification with Active Exploration for Legged Robot Sim2Real Learning

📅 2025-05-20

🤖 AI Summary

To address the failure of high-precision locomotion tasks in legged robot sim-to-real transfer caused by model mismatch, this paper proposes SPI-Active (Sampling-based Parameter Identification with Active Exploration), a two-stage framework that combines sampling-based system identification with active exploration. Designed for contact-rich legged systems, SPI-Active requires neither differentiable dynamics nor direct torque sensing. It identifies key physical parameters through massively parallel sampling, minimizing the state prediction error between simulated and real-world trajectories. To make the collected data more informative, an active exploration strategy optimizes the input commands of an exploration policy to maximize the Fisher Information of the real-world trajectories. Evaluated on various locomotion tasks, SPI-Active enables precise sim-to-real transfer of learned policies, outperforming baselines by 42–63% and generalizing better across diverse tasks.

📝 Abstract

Sim-to-real discrepancies hinder learning-based policies from achieving high-precision tasks in the real world. While Domain Randomization (DR) is commonly used to bridge this gap, it often relies on heuristics and, when not properly tuned, can lead to overly conservative policies with degraded performance. System Identification (Sys-ID) offers a targeted approach, but standard techniques rely on differentiable dynamics and/or direct torque measurement, assumptions that rarely hold for contact-rich legged systems. To this end, we present SPI-Active (Sampling-based Parameter Identification with Active Exploration), a two-stage framework that estimates physical parameters of legged robots to minimize the sim-to-real gap. SPI-Active robustly identifies key physical parameters through massive parallel sampling, minimizing state prediction errors between simulated and real-world trajectories. To further improve the informativeness of the collected data, we introduce an active exploration strategy that maximizes the Fisher Information of the collected real-world trajectories by optimizing the input commands of an exploration policy. This targeted exploration leads to accurate identification and better generalization across diverse tasks. Experiments demonstrate that SPI-Active enables precise sim-to-real transfer of learned policies, outperforming baselines by 42-63% in various locomotion tasks.
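
The sampling-based identification idea in the abstract can be illustrated with a minimal sketch on a toy 1-D damped point mass rather than a legged robot. Everything here is an assumption for illustration: the simulator `rollout`, the cross-entropy-style sampler `identify`, and the parameter ranges are hypothetical and are not the paper's implementation. A real setup would replace `rollout` with a batched physics simulator and the toy trajectory with logged robot states.

```python
import numpy as np

def rollout(params, controls, dt=0.02):
    """Simulate a damped point mass for a whole batch of parameter samples.

    params: (N, 2) array of [mass, damping] (hypothetical toy parameters).
    controls: (T,) force commands. Returns velocities of shape (N, T).
    """
    mass, damping = params[:, 0], params[:, 1]
    vel = np.zeros(params.shape[0])
    traj = np.empty((params.shape[0], len(controls)))
    for t, u in enumerate(controls):
        # Explicit Euler step, evaluated for all N samples at once.
        vel = vel + dt * (u - damping * vel) / mass
        traj[:, t] = vel
    return traj

def identify(real_traj, controls, low, high,
             n_samples=2048, n_iters=8, elite_frac=0.1, seed=0):
    """Black-box search: sample parameters, score them by state prediction
    error against the recorded trajectory, and refit the sampler to the best."""
    rng = np.random.default_rng(seed)
    low, high = np.asarray(low, float), np.asarray(high, float)
    mean, std = (low + high) / 2, (high - low) / 2
    n_elite = max(1, int(elite_frac * n_samples))
    for _ in range(n_iters):
        samples = np.clip(rng.normal(mean, std, size=(n_samples, 2)), low, high)
        errors = np.mean((rollout(samples, controls) - real_traj) ** 2, axis=1)
        elite = samples[np.argsort(errors)[:n_elite]]
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-6
    return mean

# Stand-in for a real-world recording: true parameters plus small sensor noise.
time = np.arange(500) * 0.02
controls = 1.0 + 0.5 * np.sin(np.pi * time)
true_params = np.array([[2.0, 0.5]])
real_traj = rollout(true_params, controls)[0]
real_traj = real_traj + np.random.default_rng(1).normal(0.0, 0.005, real_traj.shape)

estimate = identify(real_traj, controls, low=[0.5, 0.05], high=[5.0, 2.0])
print("estimated [mass, damping]:", estimate)
```

No gradients of the simulator are used, only forward rollouts and a scalar error per sample, which is why this style of search fits settings where the dynamics are not differentiable.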

Problem

Research questions and friction points this paper is trying to address.

Minimizing sim-to-real gap for legged robots

Identifying key physical parameters robustly

Improving data informativeness via active exploration

Innovation

Methods, ideas, or system contributions that make the work stand out.

Massive parallel sampling for parameter identification

Active exploration to maximize Fisher Information

Two-stage framework minimizing sim-to-real gap
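
To make the Fisher Information idea above concrete, here is a hedged sketch that scores candidate command sequences by how informative the resulting trajectory would be about the parameters. It assumes Gaussian measurement noise, a toy damped point mass (`simulate`), finite-difference sensitivities, and a log-determinant score; all of these, including the candidate commands, are illustrative assumptions and not the paper's formulation.

```python
import numpy as np

def simulate(theta, controls, dt=0.02):
    """Toy damped point mass; theta = [mass, damping] (hypothetical)."""
    mass, damping = theta
    vel, out = 0.0, []
    for u in controls:
        vel = vel + dt * (u - damping * vel) / mass
        out.append(vel)
    return np.array(out)

def fisher_information(theta, controls, noise_std=0.01, eps=1e-4):
    """FIM under i.i.d. Gaussian noise: J^T J / sigma^2, where J holds the
    sensitivity of each observed state to each parameter."""
    theta = np.asarray(theta, float)
    cols = []
    for i in range(len(theta)):
        step = np.zeros_like(theta)
        step[i] = eps
        # Central finite difference, so no differentiable simulator is needed.
        diff = simulate(theta + step, controls) - simulate(theta - step, controls)
        cols.append(diff / (2 * eps))
    jac = np.stack(cols, axis=1)  # shape (T, number of parameters)
    return jac.T @ jac / noise_std ** 2

def score(theta, controls):
    """Log-determinant of the FIM (an assumed scalarization); -inf if singular."""
    sign, logdet = np.linalg.slogdet(fisher_information(theta, controls))
    return logdet if sign > 0 else -np.inf

time = np.arange(500) * 0.02
candidates = {
    "zero": np.zeros_like(time),
    "constant_small": 0.2 * np.ones_like(time),
    "step_plus_sine": 1.0 + 0.5 * np.sin(np.pi * time),
}
nominal_theta = [2.0, 0.5]  # current parameter guess used to evaluate sensitivities
scores = {name: score(nominal_theta, u) for name, u in candidates.items()}
best = max(scores, key=scores.get)
print(scores, "->", best)
```

A command that never excites the system (`zero`) yields a singular information matrix, so no amount of data from it can pin down the parameters; richer commands score higher and would be preferred for data collection.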
