🤖 AI Summary
This work addresses zero-shot generalization to unseen long-horizon, goal-conditioned tasks when only a fixed demonstration dataset is available. The authors propose ZALT (Zero-shot Agents from Latent Topologies), which abstracts raw trajectories into a composable topology of hub states. ZALT identifies latent hub states where trajectories converge or diverge, learns policies and a dynamics model over hub-to-hub transitions, and plans over this abstract topology to solve new tasks zero-shot. In a complex 3D maze environment, ZALT achieves a 55% zero-shot success rate on unseen tasks, compared with 6% for the strongest baseline.
📝 Abstract
Imitation learning is effective for training agents when expert demonstrations are available, but collecting demonstrations for every complex task in an environment is costly. We study the long-horizon, goal-conditioned setting where a fixed demonstration dataset contains useful behavior, but not complete examples for every task the agent must solve. Existing imitation learning methods can learn strong policies from demonstrations, but on long-horizon tasks, small errors accumulate over long primitive-action trajectories and make zero-shot adaptation to new tasks unreliable. We introduce Zero-shot Agents from Latent Topologies (ZALT), an imitation learning method that solves unseen start-goal tasks beyond those demonstrated during training. ZALT identifies latent hub states where trajectories converge or diverge, learns policies and a dynamics model over hub-to-hub transitions, and plans over the hub topology to complete new tasks. This topology makes demonstrated behaviors explicitly composable while compressing long tasks into shorter sequences of abstract transitions; together, these two properties enable ZALT to perform zero-shot adaptation. In a complex 3D maze environment, ZALT achieves 55% zero-shot success on unseen tasks, compared to 6% for the strongest baseline.
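To make the idea of planning over a hub topology concrete, the following is a minimal sketch, not the authors' implementation. It assumes demonstrations have already been abstracted into sequences of discrete hub labels (the abstract does not say how hubs are represented), and it swaps in plain breadth-first search for the planner. The names `build_hub_graph`, `plan_hub_path`, and the hub labels are hypothetical, introduced only for illustration.

```python
from collections import deque


def build_hub_graph(hub_sequences):
    """Collect every directed hub-to-hub edge seen in any demonstration.

    Assumption: each demonstration is given as a list of hub labels.
    """
    graph = {}
    for seq in hub_sequences:
        for src, dst in zip(seq, seq[1:]):
            graph.setdefault(src, set()).add(dst)
            graph.setdefault(dst, set())
    return graph


def plan_hub_path(graph, start, goal):
    """Breadth-first search for the shortest hub sequence from start to goal.

    Returns a list of hubs, or None if the goal is unreachable.
    """
    if start not in graph or goal not in graph:
        return None
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk parent pointers back to the start, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in sorted(graph[node]):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


# Two hypothetical demonstrations that share hub "B".
demos = [["A", "B", "C"], ["D", "B", "E"]]
graph = build_hub_graph(demos)

# Neither demonstration goes from "A" to "E", but the shared hub
# lets the planner stitch segments of both together.
path = plan_hub_path(graph, "A", "E")
print(path)
```

In this toy example the start-goal pair ("A", "E") never appears in a single demonstration, yet a plan exists because both trajectories pass through hub "B". This mirrors the composability argument in the abstract: a long task reduces to a short sequence of hub-to-hub transitions, each of which would be executed by a learned low-level policy in the actual method.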