🤖 AI Summary
In cluttered, dynamic environments such as homes and offices, service robots face three challenges in semantic mapping: low efficiency, severe occlusion, and high uncertainty. To address these, this paper proposes an uncertainty-guided framework for active perception and targeted manipulation. The framework integrates evidential deep learning (using Dirichlet and Beta distributions to model semantic and occupancy confidence), Dempster–Shafer theory for uncertainty reasoning, PPO-based reinforcement learning for viewpoint planning, and lightweight pushing control. It introduces, for the first time, a minimally invasive pushing strategy that jointly optimizes viewpoint selection and object displacement to maximize information gain. Experiments show that, compared to state-of-the-art methods, the approach reduces planning time by 95%, significantly decreases unintended object displacement and drops, and enables high-accuracy, real-time, low-disturbance online semantic map updating.
📝 Abstract
Service robots operating in cluttered human environments such as homes, offices, and schools cannot rely on predefined object arrangements and must continuously update their semantic and spatial estimates while coping with potentially frequent rearrangements. Efficient and accurate mapping under such conditions demands selecting informative viewpoints and targeted manipulations to reduce occlusions and uncertainty. In this work, we present a manipulation-enhanced semantic mapping framework for occlusion-heavy shelf scenes that integrates evidential metric-semantic mapping with reinforcement-learning-based next-best-view planning and targeted action selection. Our method exploits uncertainty estimates from the Dirichlet and Beta distributions in the semantic and occupancy prediction networks to guide both active sensor placement and object manipulation, focusing on areas of limited knowledge and selecting actions with high expected information gain. For object manipulation, we introduce an uncertainty-informed push strategy that targets occlusion-critical objects and generates minimally invasive actions to reveal hidden regions. The experimental evaluation shows that our framework substantially reduces object displacement and drops while achieving a 95% reduction in planning time compared to the state of the art, which makes it applicable in real-world settings.
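To make the role of the Dirichlet and Beta uncertainty estimates more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common evidential-deep-learning convention in which a network outputs non-negative evidence e_k, the concentration parameters are alpha_k = e_k + 1, and the vacuity (lack of evidence) is u = K / sum(alpha). The Beta occupancy distribution is treated as the two-class case of the same formula. The function names, the voxel representation, and the additive scoring rule are hypothetical choices made only for illustration.

```python
from typing import List, Sequence, Tuple


def dirichlet_vacuity(evidence: Sequence[float]) -> float:
    """Vacuity u = K / S, where alpha_k = e_k + 1 and S = sum(alpha_k).

    Assumption: standard subjective-logic convention, not taken from the paper.
    """
    num_classes = len(evidence)
    strength = sum(e + 1.0 for e in evidence)
    return num_classes / strength


# A voxel is (semantic evidence per class, [occupied evidence, free evidence]).
Voxel = Tuple[Sequence[float], Sequence[float]]


def rank_by_uncertainty(voxels: List[Voxel]) -> List[int]:
    """Return voxel indices sorted from most to least uncertain.

    Hypothetical score: semantic vacuity plus occupancy vacuity, where the
    Beta occupancy distribution is handled as a two-class Dirichlet.
    """
    scores = [
        dirichlet_vacuity(semantic) + dirichlet_vacuity(occupancy)
        for semantic, occupancy in voxels
    ]
    return sorted(range(len(voxels)), key=lambda i: scores[i], reverse=True)


example_voxels: List[Voxel] = [
    ([9.0, 0.0, 0.0], [8.0, 0.0]),  # well observed: strong evidence
    ([0.0, 0.0, 0.0], [0.0, 0.0]),  # never observed, e.g. fully occluded
    ([1.0, 1.0, 1.0], [3.0, 3.0]),  # observed but ambiguous
]
ranking = rank_by_uncertainty(example_voxels)
```

In this toy setting the never-observed voxel ranks first, which matches the abstract's idea of focusing viewpoints and pushes on areas of limited knowledge. A real planner would aggregate such scores over the regions a candidate view or push is expected to reveal rather than rank single voxels.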