🤖 AI Summary
Co-designing robot morphology and neural control is bottlenecked by two problems: reinforcement learning requires costly trial and error for each body plan, and non-differentiable physical modifications (e.g., adding or removing components) cannot be evaluated efficiently.
Method: We propose “morphological pretraining”: (1) a morphology-agnostic universal controller is pretrained by gradient-based optimization through differentiable physics simulation; (2) zero-shot evolutionary search then uses the pretrained controller to evaluate structural changes immediately, without retraining; and (3) the controller is fine-tuned online on the current population throughout evolution, which preserves morphological diversity and prevents collapse.
Contribution/Results: Our method increases both morphological diversity and locomotion performance without retraining a policy for each design. It converges several times faster than conventional co-optimization approaches, offering an efficient and scalable route to morphology–control co-design.
📝 Abstract
The co-design of robot morphology and neural control typically requires using reinforcement learning to approximate a unique control policy gradient for each body plan, demanding massive amounts of training data to measure the performance of each design. Here we show that a universal, morphology-agnostic controller can be rapidly and directly obtained by gradient-based optimization through differentiable simulation. This process of morphological pretraining allows the designer to explore non-differentiable changes to a robot's physical layout (e.g., adding, removing and recombining discrete body parts) and immediately determine which revisions are beneficial and which are deleterious using the pretrained model. We term this process "zero-shot evolution" and compare it with the simultaneous co-optimization of a universal controller alongside an evolving design population. We find the latter results in diversity collapse, a previously unknown pathology whereby the population (and thus the controller's training data) converges to similar designs that are easier to steer with a shared universal controller. We show that zero-shot evolution with a pretrained controller quickly yields a diversity of highly performant designs, and by fine-tuning the pretrained controller on the current population throughout evolution, diversity is not only preserved but significantly increased as superior performance is achieved.
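The zero-shot evolution loop described in the abstract can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the binary design encoding, the placeholder `pretrained_score` function (standing in for a simulated rollout of the frozen universal controller), the single-part `mutate` operator, and truncation selection. It only shows the control flow: discrete edits to a design are scored immediately by one shared, already-trained evaluator, with no per-design retraining.

```python
import random

def pretrained_score(design):
    """Hypothetical stand-in for evaluating a design with the frozen,
    pretrained universal controller. A real system would run a simulated
    rollout; this toy function just scores the 0/1 part layout."""
    alternations = sum(1 for a, b in zip(design, design[1:]) if a != b)
    return alternations + 0.5 * sum(design)

def mutate(design, rng):
    """Non-differentiable edit: add or remove one discrete body part."""
    i = rng.randrange(len(design))
    return design[:i] + (1 - design[i],) + design[i + 1:]

def zero_shot_evolution(pop_size=8, n_parts=10, generations=20, seed=0):
    rng = random.Random(seed)
    # A design is a tuple of flags: 1 = part present, 0 = part absent.
    population = [
        tuple(rng.randint(0, 1) for _ in range(n_parts))
        for _ in range(pop_size)
    ]
    best_history = [max(pretrained_score(d) for d in population)]
    for _ in range(generations):
        children = [mutate(d, rng) for d in population]
        # Zero-shot step: children are scored by the same frozen evaluator,
        # so no policy is retrained for any new design.
        pool = sorted(population + children, key=pretrained_score, reverse=True)
        population = pool[:pop_size]  # truncation selection (an assumption)
        # In the fine-tuning variant described in the abstract, the controller
        # would be updated on `population` here; this sketch omits that step.
        best_history.append(pretrained_score(population[0]))
    return population, best_history

if __name__ == "__main__":
    final_population, best_history = zero_shot_evolution()
    print("best score per generation:", best_history)
```

Because parents compete with their children in the same pool, the best score in this toy never decreases across generations. The sketch makes no attempt to maintain diversity, which the abstract attributes to fine-tuning the controller on the current population.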