🤖 AI Summary
This work addresses a limitation of existing robot preference learning methods: they optimize learning outcomes but often neglect the user’s experience while giving feedback, and so struggle to balance learning efficiency with ease of interaction. The paper introduces CMA-ES-IG, an algorithm that builds user experience into query generation by combining the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) with an information gain criterion. The approach generates trajectories in high-dimensional behavior spaces that are both perceptually distinct and informative for users to rank. In simulated studies and real-robot experiments, CMA-ES-IG compares favorably with state-of-the-art alternatives: it scales better to higher-dimensional preference spaces, remains computationally tractable, is robust to noisy or inconsistent feedback, and is preferred by non-expert users.
📝 Abstract
Robots that interact with humans must adapt to individual users' preferences to operate effectively in human-centered environments. An intuitive and effective way to learn non-expert users' preferences is through rankings of robot behaviors, e.g., trajectories, gestures, or voices. Existing techniques primarily focus on generating queries that optimize preference learning outcomes, such as sample efficiency or final preference estimation accuracy. However, this focus on outcomes overlooks key user expectations in the process of providing these rankings, which can negatively impact users' adoption of robotic systems. This work proposes the Covariance Matrix Adaptation Evolution Strategies with Information Gain (CMA-ES-IG) algorithm. CMA-ES-IG explicitly incorporates user experience considerations into the preference learning process by suggesting perceptually distinct and informative trajectories for users to rank. We demonstrate these benefits through both simulated studies and real-robot experiments. Compared to state-of-the-art alternatives, CMA-ES-IG (1) scales more effectively to higher-dimensional preference spaces, (2) maintains computational tractability for high-dimensional problems, (3) is robust to noisy or inconsistent user feedback, and (4) is preferred by non-expert users in identifying their preferred robot behaviors. This project's code is available at github.com/interaction-lab/CMA-ES-IG.
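To make the notion of an "informative" ranking query concrete, here is a minimal sketch of how information gain could be scored for a candidate set of behaviors. It is not the authors' implementation. It assumes a linear reward over behavior features, a Plackett-Luce ranking model, and a belief over preference weights represented by samples; none of these choices is stated in the abstract. The function names `ranking_prob`, `entropy`, and `information_gain` are hypothetical, and the CMA-ES component is omitted.

```python
import itertools
import math


def ranking_prob(order, rewards):
    """Plackett-Luce probability of `order` (best first) given per-item rewards."""
    prob = 1.0
    remaining = list(order)
    while remaining:
        # Probability that the next-ranked item is picked from those remaining.
        denom = sum(math.exp(rewards[j]) for j in remaining)
        prob *= math.exp(rewards[remaining[0]]) / denom
        remaining.pop(0)
    return prob


def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def information_gain(query_features, weight_samples):
    """Mutual information between the user's ranking and the preference weights.

    Assumption: reward is the dot product of a weight sample and a feature vector.
    """
    orders = list(itertools.permutations(range(len(query_features))))
    per_sample = []
    for w in weight_samples:
        rewards = [sum(wi * fi for wi, fi in zip(w, f)) for f in query_features]
        per_sample.append([ranking_prob(o, rewards) for o in orders])
    n = len(weight_samples)
    # Ranking distribution averaged over the belief samples.
    marginal = [sum(dist[i] for dist in per_sample) / n for i in range(len(orders))]
    expected_conditional = sum(entropy(dist) for dist in per_sample) / n
    return entropy(marginal) - expected_conditional


# Illustrative belief samples and two candidate queries (made-up numbers).
samples = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
distinct_query = [(2.0, 0.0), (-2.0, 0.0), (0.0, 2.0)]
similar_query = [(0.1, 0.0), (0.1, 0.01), (0.09, 0.0)]
print(information_gain(distinct_query, samples) > information_gain(similar_query, samples))
```

Under these assumptions, a query whose items are nearly identical in feature space yields almost no information gain, because every plausible weight vector predicts nearly the same ranking distribution. This is one way that perceptual distinctness and informativeness can align. The enumeration over all permutations is only practical for small queries.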