GundamQ: Multi-Scale Spatio-Temporal Representation Learning for Robust Robot Path Planning

📅 2025-09-12

🤖 AI Summary
This paper targets two key challenges in robot path planning under dynamic and uncertain environments: insufficient modeling of multi-scale temporal dependencies and a poor exploration-exploitation trade-off. It proposes a dual-module framework that integrates multi-granularity spatiotemporal awareness with adaptive policy optimization. The method uses hierarchical feature extraction and multi-scale temporal modeling to improve prediction of, and reaction to, partially observable dynamic obstacles. It further introduces a constrained policy optimization mechanism intended to keep policy updates smooth and low in collision risk. Within a deep reinforcement learning framework, the approach unifies spatiotemporal representation learning with decision robustness. The reported experiments show a 15.3% improvement in task success rate and a 21.7% improvement in path quality over state-of-the-art methods in dynamic scenarios.

📝 Abstract
In dynamic and uncertain environments, robotic path planning demands accurate spatiotemporal environment understanding combined with robust decision-making under partial observability. However, current deep reinforcement learning-based path planning methods face two fundamental limitations: (1) insufficient modeling of multi-scale temporal dependencies, resulting in suboptimal adaptability in dynamic scenarios, and (2) inefficient exploration-exploitation balance, leading to degraded path quality. To address these challenges, we propose GundamQ: A Multi-Scale Spatiotemporal Q-Network for Robotic Path Planning. The framework comprises two key modules: (i) the Spatiotemporal Perception module, which hierarchically extracts multi-granularity spatial features and multi-scale temporal dependencies ranging from instantaneous to extended time horizons, thereby improving perception accuracy in dynamic environments; and (ii) the Adaptive Policy Optimization module, which balances exploration and exploitation during training while optimizing for smoothness and collision probability through constrained policy updates. Experiments in dynamic environments demonstrate that GundamQ achieves a 15.3% improvement in success rate and a 21.7% increase in overall path quality, significantly outperforming existing state-of-the-art methods.
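
The abstract does not specify how the temporal scales are computed, so the following is only a minimal sketch of the general idea of summarizing one observation history at several horizons, from instantaneous to extended. The function name `multi_scale_features`, the window lengths, and the choice of mean and net change as features are all assumptions made for illustration; they are not taken from the paper, which presumably learns these features with a neural network.

```python
from statistics import fmean


def multi_scale_features(history, windows=(1, 4, 16)):
    """Summarize a 1-D observation history at several time horizons.

    history: list of floats, oldest first (e.g. distance to an obstacle).
    windows: hypothetical horizon lengths in time steps (an assumption).
    Returns one (mean, trend) pair per window, in the order given.
    """
    if not history:
        raise ValueError("history must contain at least one sample")
    features = []
    for w in windows:
        recent = history[-w:]            # last w samples (fewer if history is short)
        mean = fmean(recent)             # average level over this horizon
        trend = recent[-1] - recent[0]   # net change over this horizon
        features.append((mean, trend))
    return features


# Example: an obstacle that is approaching faster and faster.
distances = [5.0, 4.5, 4.0, 3.0, 2.0]
feats = multi_scale_features(distances, windows=(1, 2, 4))
```

In this toy example the shortest window only sees the current distance, while the longer windows expose how quickly the obstacle is closing in, which is the kind of information a single-frame observation cannot provide.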

Problem

Research questions and friction points this paper is trying to address.

Improving robot path planning in dynamic and uncertain environments
Addressing insufficient modeling of multi-scale temporal dependencies
Correcting an inefficient exploration-exploitation balance during training

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical multi-scale spatiotemporal feature extraction
Adaptive exploration-exploitation policy optimization
Constrained policy updates that optimize for path smoothness and low collision probability
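
The summary does not describe the exploration schedule or the form of the constraint, so the sketch below uses two common stand-ins rather than the paper's mechanism: an exponentially decaying epsilon-greedy rule for the exploration-exploitation balance, and a penalty-based reward that discourages sharp turns and collision risk. All names (`epsilon_at`, `select_action`, `penalized_reward`) and every default weight are hypothetical.

```python
import math
import random


def epsilon_at(step, eps_start=1.0, eps_end=0.05, decay_steps=1000):
    """Exponentially decay the exploration rate from eps_start toward eps_end."""
    return eps_end + (eps_start - eps_end) * math.exp(-step / decay_steps)


def select_action(q_values, step, rng=random):
    """Epsilon-greedy: a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon_at(step):
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def penalized_reward(reward, heading_change, collision_prob,
                     smooth_weight=0.1, collision_weight=1.0):
    """Subtract penalties for sharp turns and for estimated collision risk.

    This is a soft-penalty relaxation of a constraint (an assumption),
    not a hard constrained update.
    """
    return (reward
            - smooth_weight * abs(heading_change)
            - collision_weight * collision_prob)
```

Early in training `epsilon_at` is close to 1, so actions are mostly random; later the agent mostly follows its Q-values. The penalty weights trade task reward against smoothness and safety and would need tuning for any real planner.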