🤖 AI Summary
Industrial quadruped robots face challenges in robustly navigating dynamic environments, and conventional testing methods suffer from low coverage and poor reproducibility.
Method: This paper adapts Surrealist, a search-based simulation testing framework originally developed for UAVs, to the ANYmal quadruped platform. We propose an automated, closed-loop scenario generation and verification methodology integrating high-fidelity simulation modeling, evolutionary scene mutation strategies, and quantitative success-rate evaluation. The approach enables objective, black-box comparison and systematic defect exposure for proprietary navigation algorithms.
Contribution/Results: In the pilot phase, the generated test suites exposed critical weaknesses in one experimental navigation algorithm (40.3% success rate) and served as a benchmark confirming the greater robustness of another (71.2% success rate). In a subsequent six-month industrial evaluation, the framework was used to test five proprietary algorithms in an efficient, repeatable way. The method improves automation, reproducibility, and rigor in industrial-grade navigation system validation.
📝 Abstract
Ensuring robust robotic navigation in dynamic environments is a key challenge, as traditional testing methods often struggle to cover the full spectrum of operational requirements. This paper presents the industrial adoption of Surrealist, a simulation-based test generation framework originally developed for UAVs and now applied to the ANYmal quadrupedal robot for industrial inspection. Our method uses a search-based algorithm to automatically generate challenging obstacle avoidance scenarios, uncovering failures often missed by manual testing. In a pilot phase, generated test suites revealed critical weaknesses in one experimental algorithm (40.3% success rate) and served as an effective benchmark to demonstrate the superior robustness of another (71.2% success rate). The framework was then integrated into the ANYbotics workflow for a six-month industrial evaluation, where it was used to test five proprietary algorithms. A formal survey confirmed its value, showing that it enhances the development process, uncovers critical failures, provides objective benchmarks, and strengthens the overall verification pipeline.
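To make the idea of search-based scenario generation concrete, the following is a minimal Python sketch, not Surrealist's actual implementation. It assumes a simple hill-climbing search and replaces the simulator with a hypothetical `toy_clearance` function; the names `Obstacle`, `mutate`, `search_hard_scenario`, `success_rate`, and the `min_gap` feasibility threshold are all illustrative assumptions. In a real setup, the fitness would come from running the navigation stack in simulation.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Obstacle:
    """Hypothetical circular obstacle (all units in metres)."""
    x: float
    y: float
    radius: float


def toy_clearance(obstacles, path_y=0.0):
    """Stand-in for a simulation run (assumption, not a real simulator).

    Returns the smallest gap between the straight path y = path_y and the
    edge of any obstacle. A smaller gap means a harder scenario.
    """
    return min(abs(o.y - path_y) - o.radius for o in obstacles)


def mutate(obstacles, rng, step=0.3):
    """Return a copy of the scenario with one obstacle randomly perturbed."""
    i = rng.randrange(len(obstacles))
    o = obstacles[i]
    moved = Obstacle(
        x=o.x + rng.uniform(-step, step),
        y=o.y + rng.uniform(-step, step),
        radius=max(0.1, o.radius + rng.uniform(-step, step) / 2),
    )
    return obstacles[:i] + [moved] + obstacles[i + 1:]


def search_hard_scenario(seed_scenario, budget=200, min_gap=0.4, rng_seed=0):
    """Hill-climb toward scenarios that are harder but still passable.

    A candidate is accepted only if it lowers the clearance while keeping
    it at or above min_gap (an assumed minimum passable gap).
    """
    rng = random.Random(rng_seed)
    best = list(seed_scenario)
    best_score = toy_clearance(best)
    for _ in range(budget):
        candidate = mutate(best, rng)
        score = toy_clearance(candidate)
        if min_gap <= score < best_score:
            best, best_score = candidate, score
    return best, best_score


def success_rate(outcomes):
    """Percentage of successful runs in a list of booleans."""
    return 100.0 * sum(outcomes) / len(outcomes)


if __name__ == "__main__":
    seed = [Obstacle(3.0, 2.0, 0.5), Obstacle(6.0, -1.5, 0.5)]
    scenario, gap = search_hard_scenario(seed)
    print(f"seed clearance: {toy_clearance(seed):.2f} m, found: {gap:.2f} m")
```

The feasibility check matters in this sketch: without it, the search would trivially place an obstacle on the path, which exposes nothing about the algorithm under test. A success rate like those reported above would then be computed over the outcomes of running each generated scenario, as `success_rate` does for a list of pass/fail results.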