🤖 AI Summary
This work addresses the challenges of low sample efficiency and poor spatial generalization in mobile manipulation tasks by proposing a novel approach that integrates Adaptive Experience Selection (AES) with forward planning based on a Recurrent State Space Model (RSSM). The AES mechanism prioritizes experience segments critical to task success, thereby mitigating skill forgetting, while the RSSM effectively captures the coupled dynamics between the mobile base and manipulator, enabling zero-shot generalization to new spatial layouts without retraining. By combining multi-stage heterogeneous skill integration with model-predictive forward planning, the proposed method significantly outperforms existing approaches across diverse experimental settings. Real-world robot experiments further demonstrate its effectiveness and practicality in novel environments.
📝 Abstract
Mobile Manipulation (MM) involves long-horizon decision-making over multi-stage compositions of heterogeneous skills, such as navigating and picking up objects. Despite recent progress, existing MM methods still face two key limitations: (i) low sample efficiency, due to ineffective use of the redundant data generated during long-horizon MM interactions; and (ii) poor spatial generalization, as policies trained on specific tasks struggle to transfer to new spatial layouts without additional training. In this paper, we address these challenges through Adaptive Experience Selection (AES) and model-based dynamic imagination. In particular, AES directs MM agents' attention to the critical experience fragments in long trajectories that determine task success, improving skill-chain learning and mitigating skill forgetting. Building on AES, we introduce a Recurrent State-Space Model (RSSM) for Model-Predictive Forward Planning (MPFP): the RSSM captures the coupled dynamics between the mobile base and the manipulator and imagines the dynamics of future manipulations. RSSM-based MPFP reinforces MM skill learning on the current task while enabling effective generalization to new spatial layouts. Comparative studies across different experimental configurations demonstrate that our method significantly outperforms existing MM policies. Real-world experiments further validate the feasibility and practicality of our method.
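To make the AES idea concrete, the sketch below shows one plausible reading: trajectory segments are kept in a replay buffer and sampled with probability proportional to a priority score reflecting how critical they are to task success. The class name, the priority heuristic, and all parameters are illustrative assumptions, not the paper's exact formulation.

```python
import random


class AdaptiveExperienceBuffer:
    """Hypothetical sketch of Adaptive Experience Selection (AES):
    experience segments from long trajectories are replayed with
    probability proportional to an assigned priority (e.g., higher
    for fragments near skill transitions or task success)."""

    def __init__(self):
        # Each entry is a (segment, priority) pair.
        self.segments = []

    def add(self, segment, priority):
        # Clamp priority so every segment keeps a nonzero chance.
        self.segments.append((segment, max(priority, 1e-6)))

    def sample(self, k):
        # Priority-weighted sampling with replacement, so critical
        # fragments are revisited more often during training.
        segs, prios = zip(*self.segments)
        return random.choices(segs, weights=prios, k=k)
```

In this reading, frequently replaying success-critical fragments is what counters skill forgetting across the multi-stage skill chain; uniform replay would dilute them among redundant navigation data.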