🤖 AI Summary
This work addresses frequent system crashes during preferential Bayesian optimization (PBO) on hardware systems such as quadcopter control, where crashes interrupt experiments and standard PBO methods cannot use the information they carry. The authors propose CrashPBO, a framework that incorporates crash feedback into PBO by introducing a crash indicator variable, so that user preferences and crash events are modeled jointly within a unified surrogate model. This lets the optimizer steer away from regions of the parameter space that tend to cause crashes. On synthetic benchmarks, CrashPBO reduces crashes by 63% and improves data efficiency. Experiments on three robotic platforms further demonstrate its applicability and transferability for human-in-the-loop parameter learning.
📝 Abstract
Bayesian optimization is a popular black-box optimization method for parameter learning in control and robotics. It typically requires an objective function that reflects the user's optimization goal. However, in practical applications, this objective function is often inaccessible due to complex or unmeasurable performance metrics. Preferential Bayesian optimization (PBO) overcomes this limitation by leveraging human feedback through pairwise comparisons, eliminating the need for explicit performance quantification. When applying PBO to hardware systems, such as in quadcopter control, crashes can cause time-consuming experimental resets, wear and tear, or otherwise undesired outcomes. Standard PBO methods cannot incorporate feedback from such crashed experiments, resulting in the exploration of parameters that frequently lead to experimental crashes. We thus introduce CrashPBO, a user-friendly mechanism that enables users to both express preferences and report crashes during the optimization process. Benchmarking on synthetic functions shows that this mechanism reduces crashes by 63% and increases data efficiency. Through experiments on three robotics platforms, we demonstrate the wide applicability and transferability of CrashPBO, highlighting that it provides a flexible, user-friendly framework for parameter learning with human feedback on preferences and crashes.
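To make the two feedback channels concrete, the following is a minimal sketch of how pairwise preferences and crash reports might be recorded and split into training data. It is not CrashPBO's implementation: the names `Trial`, `Comparison`, and `to_training_data` are hypothetical, and the choice to record a preference only when both trials ran without crashing is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Params = Tuple[float, ...]  # one candidate parameter vector, e.g. controller gains


@dataclass
class Trial:
    """One experiment run (hypothetical record type, not from the paper)."""
    params: Params
    crashed: bool = False  # reported by the user


@dataclass
class Comparison:
    """A pair of trials shown to the user; `preferred` is 'a', 'b', or None."""
    a: Trial
    b: Trial
    preferred: Optional[str] = None


def to_training_data(
    comparisons: List[Comparison],
) -> Tuple[List[Tuple[Params, Params]], Dict[Params, bool]]:
    """Split user feedback into (winner, loser) pairs and per-point crash labels.

    Assumption: a preference is kept only if neither trial crashed, so
    crashed runs contribute through the crash labels alone.
    """
    prefs: List[Tuple[Params, Params]] = []
    crashes: Dict[Params, bool] = {}
    for c in comparisons:
        # Every trial yields a crash label, whether or not it is compared.
        # A point that crashed in any run stays labeled as crashed.
        crashes[c.a.params] = crashes.get(c.a.params, False) or c.a.crashed
        crashes[c.b.params] = crashes.get(c.b.params, False) or c.b.crashed
        if c.a.crashed or c.b.crashed or c.preferred is None:
            continue
        winner, loser = (c.a, c.b) if c.preferred == "a" else (c.b, c.a)
        prefs.append((winner.params, loser.params))
    return prefs, crashes


history = [
    Comparison(Trial((0.5, 1.0)), Trial((0.8, 1.2)), preferred="b"),
    Comparison(Trial((0.8, 1.2)), Trial((2.0, 3.0), crashed=True)),
]
prefs, crashes = to_training_data(history)
```

In this sketch the second comparison produces no preference pair, but it still labels `(2.0, 3.0)` as a crash, which is the kind of information the abstract says standard PBO methods cannot incorporate.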