🤖 AI Summary
This work addresses the challenge of enabling robots to autonomously manipulate diverse articulated objects in open-ended environments. It combines Quality-Diversity (QD) algorithms with sparse-reward exploration to automatically generate a repertoire of high-performing, low-level trajectory primitives that accomplish the same manipulation task in different ways. Having several alternatives for one task lets the robot pick a trajectory that fits the live constraints and unexpected changes of a real-world scene. The method, QDTraj, generates at least 5 times more diverse trajectories than the compared methods on both hinge and slider activation tasks. Evaluated on 30 articulations from the PartNetMobility dataset, it produces an average of 704 different trajectories per task. The generated trajectories were also deployed in the real world.
📝 Abstract
Thanks to the latest advances in learning and robotics, domestic robots are beginning to enter homes, aiming to execute household chores autonomously. However, robots still struggle to perform autonomous manipulation tasks in open-ended environments. In this context, this paper presents a method that enables a robot to manipulate a wide spectrum of articulated objects.
We automatically generate diverse low-level robot trajectory primitives for manipulating a given object articulation. A key consideration when generating expert trajectories is the diversity of solutions that achieve the same goal: knowing several low-level primitives for the same task lets the robot choose the one best suited to its real-world environment, with its live constraints and unexpected changes. To this end, we propose a method based on Quality-Diversity algorithms that leverages sparse-reward exploration to generate a set of diverse, high-performing trajectory primitives for a given manipulation task.
We validated our method, QDTraj, by generating diverse trajectories in simulation and deploying them in the real world. QDTraj generates at least 5 times more diverse trajectories than the other methods we compared against, for both hinge and slider activation tasks. We assessed the generalization of our method over 30 articulations of the PartNetMobility articulated object dataset, obtaining an average of 704 different trajectories per task. Code is publicly available at: https://kappel.web.isir.upmc.fr/trajectory_primitive_website
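To make the Quality-Diversity idea in the abstract more concrete, the sketch below runs a generic MAP-Elites-style loop on a toy planar task. It is not the QDTraj implementation: the goal position, the displacement-step genome, the midpoint behavior descriptor, the path-length quality, and every function name are assumptions chosen for illustration. The reward is sparse (a trajectory either ends near the goal or it does not), and the archive keeps one elite per descriptor cell so that many distinct ways of reaching the same goal can coexist.

```python
import random

# All constants and names below are hypothetical, chosen only for this toy example.
GOAL = (1.0, 0.0)   # target end position in the plane
TOLERANCE = 0.15    # success radius around the goal
N_STEPS = 4         # displacement steps per trajectory
GRID = 10           # archive resolution per descriptor dimension


def to_bin(value, low=-1.5, high=1.5):
    """Map a coordinate to a grid index, clamping values outside [low, high]."""
    ratio = (value - low) / (high - low)
    return min(GRID - 1, max(0, int(ratio * GRID)))


def rollout(genome):
    """Integrate displacement steps into 2-D waypoints starting at the origin."""
    x, y = 0.0, 0.0
    points = [(x, y)]
    for dx, dy in genome:
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


def evaluate(genome):
    """Return (success, quality, cell) for one trajectory."""
    points = rollout(genome)
    end_x, end_y = points[-1]
    dist = ((end_x - GOAL[0]) ** 2 + (end_y - GOAL[1]) ** 2) ** 0.5
    success = dist <= TOLERANCE  # sparse reward: goal reached or not
    length = sum(
        ((bx - ax) ** 2 + (by - ay) ** 2) ** 0.5
        for (ax, ay), (bx, by) in zip(points, points[1:])
    )
    quality = -length  # shorter paths score higher
    mid_x, mid_y = points[len(points) // 2]
    cell = (to_bin(mid_x), to_bin(mid_y))  # descriptor: where the path passes midway
    return success, quality, cell


def mutate(genome, rng, sigma=0.1):
    """Perturb every displacement step with Gaussian noise."""
    return [(dx + rng.gauss(0, sigma), dy + rng.gauss(0, sigma)) for dx, dy in genome]


def map_elites(iterations=5000, n_random=200, seed=0):
    """Fill an archive mapping descriptor cells to their best trajectory so far."""
    rng = random.Random(seed)
    archive = {}  # cell -> (success, quality, genome)
    for i in range(iterations):
        if i < n_random or not archive:
            # Bootstrap phase: sample random genomes.
            genome = [(rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5))
                      for _ in range(N_STEPS)]
        else:
            # Variation phase: mutate a uniformly chosen elite.
            parent = rng.choice(list(archive.values()))[2]
            genome = mutate(parent, rng)
        success, quality, cell = evaluate(genome)
        incumbent = archive.get(cell)
        # Successful trajectories beat failed ones; ties are broken by quality.
        if incumbent is None or (success, quality) > incumbent[:2]:
            archive[cell] = (success, quality, genome)
    return archive


archive = map_elites()
successes = [entry for entry in archive.values() if entry[0]]
print(len(archive), "cells filled,", len(successes), "successful trajectories")
```

Failed trajectories are still stored when their cell is empty, so they can act as stepping stones toward regions where the sparse reward is eventually obtained. The successful entries in the archive play the role of the diverse trajectory set in this toy setting.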