Gameplay Filters: Robust Zero-Shot Safety through Adversarial Imagination

📅 2024-05-01

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

Problem: Legged robots must stay safe under unknown disturbances without on-robot fine-tuning (zero-shot), but existing safety filters for such complex dynamics rely on local, reduced-order models and lose robustness when conditions depart from nominal. Method: The paper proposes the gameplay filter, a predictive safety filter built on adversarial imagination. A safety policy and a virtual adversary are co-trained in simulation, with the adversary learning to invoke worst-case disturbances and sim-to-real error. At runtime, the filter continually plays out imagined matches between the two and overrides any candidate action that would lead to failure later on. Contributions/Results: The authors present a first-of-its-kind safety filter that operates directly on full-order (36-dimensional) quadrupedal dynamics. In physical experiments on two quadruped platforms, the filter is deployed zero-shot and shows superior effectiveness under large perturbations, including tugging and unmodeled terrain. Code and real-robot videos are publicly available.

📝 Abstract

Despite the impressive recent advances in learning-based robot control, ensuring robustness to out-of-distribution conditions remains an open challenge. Safety filters can, in principle, keep arbitrary control policies from incurring catastrophic failures by overriding unsafe actions, but existing solutions for complex (e.g., legged) robot dynamics do not span the full motion envelope and instead rely on local, reduced-order models. These filters tend to overly restrict agility and can still fail when perturbed away from nominal conditions. This paper presents the gameplay filter, a new class of predictive safety filter that continually plays out hypothetical matches between its simulation-trained safety strategy and a virtual adversary co-trained to invoke worst-case events and sim-to-real error, and precludes actions that would cause failures down the line. We demonstrate the scalability and robustness of the approach with a first-of-its-kind full-order safety filter for (36-D) quadrupedal dynamics. Physical experiments on two different quadruped platforms demonstrate the superior zero-shot effectiveness of the gameplay filter under large perturbations such as tugging and unmodeled terrain. Experiment videos and open-source software are available online: https://saferobotics.org/research/gameplay-filter
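
The filtering loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy 1-D double integrator in place of the 36-D quadruped dynamics, and hand-written rules in place of the learned safety policy and adversary. Every name and constant below (`step`, `safety_policy`, `adversary`, `filter_action`, `X_MAX`, and so on) is hypothetical and chosen only for illustration. The idea shown is the imagined match: apply the candidate action once, let the safety policy play against the adversary for a fixed horizon, and override the candidate if the imagined game ends in failure.

```python
# Toy sketch of a rollout-based ("gameplay"-style) safety filter.
# Assumptions: 1-D double integrator, hand-coded policies, made-up constants.

DT = 0.1      # integration step [s]
X_MAX = 1.0   # failure if |position| exceeds this bound
U_MAX = 1.0   # control authority
D_MAX = 0.3   # disturbance authority of the imagined adversary


def clip(value, limit):
    """Clamp value to the interval [-limit, limit]."""
    return max(-limit, min(limit, value))


def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def step(state, u, d):
    """Advance (position, velocity) by one step under control u and disturbance d."""
    x, v = state
    v_next = v + (clip(u, U_MAX) + clip(d, D_MAX)) * DT
    x_next = x + v_next * DT
    return (x_next, v_next)


def failed(state):
    """The failure set is everything outside the position bound."""
    return abs(state[0]) > X_MAX


def safety_policy(state):
    """Stand-in for the learned safety policy: saturated PD pull toward the origin."""
    x, v = state
    return clip(-2.0 * x - 3.0 * v, U_MAX)


def adversary(state):
    """Stand-in for the learned adversary: push toward the nearest failure boundary."""
    x, v = state
    direction = sign(x) if x != 0 else sign(v)
    return D_MAX * direction


def gameplay_rollout_is_safe(state, first_action, horizon=50):
    """Imagine applying first_action once, then safety policy vs. adversary."""
    s = step(state, first_action, adversary(state))
    if failed(s):
        return False
    for _ in range(horizon):
        s = step(s, safety_policy(s), adversary(s))
        if failed(s):
            return False
    return True


def filter_action(state, task_action, horizon=50):
    """Return (action_to_execute, overridden)."""
    if gameplay_rollout_is_safe(state, task_action, horizon):
        return task_action, False
    return safety_policy(state), True
```

In this sketch, calling `filter_action((0.0, 0.0), 1.0)` lets the task action through, while `filter_action((0.5, 0.8), 1.0)` overrides it because the imagined match leaves the position bound within a few steps. The finite horizon and the fixed adversary rule are simplifications; the paper's filter instead uses policies trained in simulation.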

Problem

Research questions and friction points this paper is trying to address.

Robot Learning

Stability

Adaptive Control

Innovation

Methods, ideas, or system contributions that make the work stand out.

Gameplay Filter

Predictive Failure Prevention

Quadruped Robot Stability

🔎 Similar Papers

Safe Reinforcement Learning in Black-Box Environments via Adaptive Shielding