Predictive Red Teaming: Breaking Policies Without Breaking Robots

📅 2025-02-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Evaluating how robust a visuomotor policy is to environmental changes such as lighting, visual distractors, and object locations normally requires time-consuming and expensive hardware trials. This paper formulates the problem of *predictive red teaming*: discovering a policy's vulnerabilities to environmental factors and predicting the resulting performance degradation without hardware evaluations in off-nominal scenarios. The authors present RoboART, an automated red teaming pipeline that edits nominal observations with generative image editing to vary environmental factors, then runs a policy-specific anomaly detector on the edited observations to predict performance under each variation. In experiments with visuomotor diffusion policies spanning twelve off-nominal conditions and more than 500 hardware trials, the average difference between predicted and real success rates is below 0.19. Fine-tuning on data collected under the conditions predicted to be adverse improves baseline performance by 2–7×.

📝 Abstract

Visuomotor policies trained via imitation learning are capable of performing challenging manipulation tasks, but are often extremely brittle to lighting, visual distractors, and object locations. These vulnerabilities can depend unpredictably on the specifics of training, and are challenging to expose without time-consuming and expensive hardware evaluations. We propose the problem of predictive red teaming: discovering vulnerabilities of a policy with respect to environmental factors, and predicting the corresponding performance degradation without hardware evaluations in off-nominal scenarios. In order to achieve this, we develop RoboART: an automated red teaming (ART) pipeline that (1) modifies nominal observations using generative image editing to vary different environmental factors, and (2) predicts performance under each variation using a policy-specific anomaly detector executed on edited observations. Experiments across 500+ hardware trials in twelve off-nominal conditions for visuomotor diffusion policies demonstrate that RoboART predicts performance degradation with high accuracy (less than 0.19 average difference between predicted and real success rates). We also demonstrate how predictive red teaming enables targeted data collection: fine-tuning with data collected under conditions predicted to be adverse boosts baseline performance by 2-7x.
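The prediction step in (2) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the anomaly score is taken to be the distance from an observation embedding to its nearest nominal embedding, the threshold is a quantile of scores on held-out nominal data, and the predicted success rate is the fraction of edited observations that stay under the threshold. The function names, the 2-D synthetic "embeddings", and the `quantile` and `nominal_success` parameters are all hypothetical.

```python
import math
import random


def nearest_distance(x, reference):
    """Anomaly score (assumed): Euclidean distance to the nearest reference embedding."""
    return min(math.dist(x, r) for r in reference)


def calibrate_threshold(calibration, reference, quantile=0.95):
    """Score below which roughly `quantile` of held-out nominal embeddings fall."""
    scores = sorted(nearest_distance(x, reference) for x in calibration)
    index = min(len(scores) - 1, math.ceil(quantile * len(scores)) - 1)
    return scores[index]


def predicted_success_rate(edited, reference, threshold, nominal_success=1.0):
    """Fraction of edited observations judged in-distribution, scaled by nominal success."""
    in_dist = sum(nearest_distance(x, reference) <= threshold for x in edited)
    return nominal_success * in_dist / len(edited)


def mean_abs_difference(predicted, measured):
    """Average |predicted - measured| success rate across conditions."""
    return sum(abs(predicted[k] - measured[k]) for k in predicted) / len(predicted)


def sample(n, shift, rng):
    """Synthetic 2-D stand-in for embeddings of observations (not real image features)."""
    return [(rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)) for _ in range(n)]


rng = random.Random(0)
reference = sample(200, 0.0, rng)    # embeddings of nominal training observations
calibration = sample(100, 0.0, rng)  # held-out nominal observations
threshold = calibrate_threshold(calibration, reference)

# Two hypothetical edited conditions: a mild shift and a severe shift.
edited = {"mild": sample(100, 0.1, rng), "severe": sample(100, 10.0, rng)}
predicted = {
    name: predicted_success_rate(obs, reference, threshold)
    for name, obs in edited.items()
}
print(predicted)
```

In this toy setup, conditions can be ranked by predicted success rate to decide where to collect more data, and `mean_abs_difference` gives the kind of aggregate comparison the abstract reports once real success rates from hardware trials are available for the same conditions.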

Problem

Research questions and friction points this paper is trying to address.

Identifying policy vulnerabilities

Predicting performance degradation

Avoiding hardware evaluations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative image editing

Policy-specific anomaly detector

Automated red teaming pipeline
