🤖 AI Summary
This work addresses the poor robustness and limited generalization of existing learning-based control methods in complex urban sidewalk environments, which often suffer from error accumulation in offline imitation learning. To overcome these limitations, we propose a multi-scale hierarchical imitation learning framework that explicitly models both short-term interactive behaviors and long-term goal-directed intentions. Our approach integrates corrective behavior data augmentation with time-span-based trajectory clustering and enhances training data diversity through teleoperation and sensor augmentation. This design significantly improves the policy's ability to recover from errors, demonstrating superior stability and generalization compared to current baselines in real-world, diverse sidewalk scenarios.
📝 Abstract
Sidewalk micromobility is a promising solution for last-mile transportation, but current learning-based control methods struggle in complex urban environments. Imitation learning (IL) learns policies from human demonstrations, yet its reliance on fixed offline data often leads to compounding errors, limited robustness, and poor generalization. To address these challenges, we propose a framework that advances IL through corrective behavior expansion and multi-scale imitation learning. On the data side, we augment teleoperation datasets with diverse corrective behaviors and sensor augmentations to enable the policy to learn to recover from its own mistakes. On the model side, we introduce a multi-scale IL architecture that captures both short-horizon interactive behaviors and long-horizon goal-directed intentions via horizon-based trajectory clustering and hierarchical supervision. Real-world experiments show that our approach significantly improves robustness and generalization in diverse sidewalk scenarios.
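The abstract does not specify how horizon-based trajectory clustering is implemented; a minimal sketch of the general idea, assuming trajectories are 2-D waypoint sequences and that segments are summarized by their net displacement (the function name `horizon_clusters`, the horizon lengths, and the displacement summary are all illustrative assumptions, not the paper's method):

```python
import numpy as np

def horizon_clusters(trajectories, horizons=(5, 20)):
    """Group trajectory segments by time horizon (hypothetical sketch).

    trajectories: list of (T, 2) arrays of planar positions.
    horizons: segment lengths in timesteps, loosely standing in for the
    short-horizon (interactive) vs long-horizon (goal-directed) split.
    Returns {horizon: (N, 2) array of start->end displacement vectors}.
    """
    clusters = {h: [] for h in horizons}
    for traj in trajectories:
        T = len(traj)
        for h in horizons:
            for t in range(T - h):
                # Summarize each segment by its net displacement, a crude
                # proxy for the intention expressed over that horizon.
                clusters[h].append(traj[t + h] - traj[t])
    return {h: np.stack(v) for h, v in clusters.items() if v}
```

Displacement vectors grouped this way could then be fed to any standard clustering step (e.g. k-means) per horizon, giving the short- and long-horizon supervision targets the abstract alludes to.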