🤖 AI Summary
Generating safe, feasible, and multimodal cooperative behaviors for robot swarms remains challenging due to complex motion constraints, dynamic obstacle avoidance, and multimodal trajectory distributions.
Method: This paper combines a learned generative model with a differentiable safety filter (SF). Either a conditional variational autoencoder (CVAE) or a vector-quantized VAE (VQ-VAE) models the multimodal trajectory distribution, and the SF projects sampled trajectories onto the feasible set defined by kinodynamic and collision-avoidance constraints. A neural network, trained in a self-supervised manner through the differentiable SF solver, predicts context-specific initializations that accelerate the solver's convergence.
Contribution/Results: The system generates diverse, dynamically feasible, and collision-free swarm trajectories within a few tens of milliseconds. The learned initialization lets the SF solver converge faster than alternative heuristic initializations, with a reported speedup of about one order of magnitude. To our knowledge, this is the first work to enable end-to-end differentiable, multimodal, safe trajectory generation at this time scale, easing the long-standing trade-off between real-time performance and constraint feasibility in swarm coordination.
📝 Abstract
Coordination behavior in robot swarms is inherently multi-modal: there are numerous ways in which a swarm of robots can avoid inter-agent collisions and reach their respective goals. However, the problem of generating diverse and feasible swarm behaviors in a scalable manner remains largely unaddressed. In this paper, we fill this gap by combining generative models with a safety filter (SF). Specifically, we sample diverse trajectories from a learned generative model and then project them onto the feasible set using the SF. We experiment with two choices of generative model: the Conditional Variational Autoencoder (CVAE) and the Vector-Quantized Variational Autoencoder (VQ-VAE). We highlight the trade-offs between these two models in terms of computation time and trajectory diversity. We develop a custom solver for our SF and equip it with a neural network that predicts a context-specific initialization. The initialization network is trained in a self-supervised manner, taking advantage of the differentiability of the SF solver. We provide two sets of empirical results. First, we demonstrate that we can generate a large set of multi-modal, feasible trajectories, simulating diverse swarm behaviors, within a few tens of milliseconds. Second, we show that our initialization network makes our SF solver converge faster than alternative initialization heuristics.
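The sample-then-project idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the paper's method: `sample_candidates` is a hypothetical stand-in for the learned CVAE/VQ-VAE sampler (it only perturbs straight-line paths), and `safety_filter` is a simple iterative pairwise push-apart heuristic that handles a minimum-separation constraint only, not the custom differentiable solver or the kinodynamic constraints mentioned in the abstract.

```python
import numpy as np


def sample_candidates(starts, goals, num_samples, horizon, noise_scale, rng):
    """Hypothetical stand-in for a learned generative model.

    Returns an array of shape (num_samples, num_agents, horizon, 2) holding
    straight-line paths with a random lateral bump that vanishes at both ends.
    """
    t = np.linspace(0.0, 1.0, horizon)[None, :, None]                 # (1, T, 1)
    nominal = starts[:, None, :] + t * (goals - starts)[:, None, :]   # (A, T, 2)
    bump = np.sin(np.pi * t)                                          # (1, T, 1)
    offsets = rng.normal(scale=noise_scale,
                         size=(num_samples, starts.shape[0], 1, 2))   # (S, A, 1, 2)
    return nominal[None] + offsets * bump[None]


def safety_filter(traj, d_min, num_iters=50):
    """Illustrative filter: push apart agent pairs closer than d_min.

    traj has shape (num_agents, horizon, 2). Each sweep moves both agents of a
    violating pair by half the separation deficit along their connecting line.
    """
    x = traj.copy()
    num_agents = x.shape[0]
    fallback = np.array([1.0, 0.0])  # direction used when two agents coincide
    for _ in range(num_iters):
        moved = False
        for i in range(num_agents):
            for j in range(i + 1, num_agents):
                diff = x[i] - x[j]                          # (T, 2)
                dist = np.linalg.norm(diff, axis=-1)        # (T,)
                deficit = np.maximum(d_min - dist, 0.0)     # (T,)
                if not np.any(deficit > 0.0):
                    continue
                moved = True
                unit = diff / np.maximum(dist, 1e-9)[:, None]
                direction = np.where((dist > 1e-9)[:, None], unit, fallback)
                x[i] += 0.5 * deficit[:, None] * direction
                x[j] -= 0.5 * deficit[:, None] * direction
        if not moved:  # no pair violated the constraint in this sweep
            break
    return x


def min_separation(traj):
    """Smallest inter-agent distance over all pairs and timesteps."""
    num_agents = traj.shape[0]
    return min(
        np.linalg.norm(traj[i] - traj[j], axis=-1).min()
        for i in range(num_agents)
        for j in range(i + 1, num_agents)
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    starts = np.array([[0.0, 0.0], [4.0, 0.0]])
    goals = starts[::-1].copy()  # the two agents swap positions
    candidates = sample_candidates(starts, goals, 8, 21, 0.1, rng)
    filtered = np.stack([safety_filter(c, d_min=0.5) for c in candidates])
    print("min separation before:", min(min_separation(c) for c in candidates))
    print("min separation after: ", min(min_separation(f) for f in filtered))
```

For two agents a single sweep already restores the minimum separation at every timestep; with more agents, resolving one pair can disturb another, so the sweeps are repeated and convergence within `num_iters` is not guaranteed. That limitation is one reason a dedicated solver with a good initialization, as the abstract describes, is attractive.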