Goal Reaching with Eikonal-Constrained Hierarchical Quasimetric Reinforcement Learning

📅 2025-12-12
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address poor generalization and difficulty in maintaining local consistency of goal-conditioned reinforcement learning (GCRL) in complex dynamic environments, this paper proposes an Eikonal-equation-constrained quasimetric value learning framework operating in continuous time. It introduces the Eikonal partial differential equation (PDE) to RL for the first time, enabling trajectory-agnostic continuous-time modeling. We design a hierarchical Eik-HiQRL architecture that decouples long-horizon goal planning from low-level dynamical control. The method integrates Eikonal PDE-constrained optimization, quasimetric neural network representation, and a hierarchical offline training paradigm. Evaluated on offline goal navigation tasks, our approach achieves state-of-the-art (SOTA) performance; on robotic manipulation tasks, it significantly outperforms baseline QRL methods while matching the stability and accuracy of temporal-difference methods.

Technology Category

Application Category

📝 Abstract
Goal-Conditioned Reinforcement Learning (GCRL) mitigates the difficulty of reward design by framing tasks as goal reaching rather than maximizing hand-crafted reward signals. In this setting, the optimal goal-conditioned value function naturally forms a quasimetric, motivating Quasimetric RL (QRL), which constrains value learning to quasimetric mappings and enforces local consistency through discrete, trajectory-based constraints. We propose Eikonal-Constrained Quasimetric RL (Eik-QRL), a continuous-time reformulation of QRL based on the Eikonal Partial Differential Equation (PDE). This PDE-based structure makes Eik-QRL trajectory-free, requiring only sampled states and goals, while improving out-of-distribution generalization. We provide theoretical guarantees for Eik-QRL and identify limitations that arise under complex dynamics. To address these challenges, we introduce Eik-Hierarchical QRL (Eik-HiQRL), which integrates Eik-QRL into a hierarchical decomposition. Empirically, Eik-HiQRL achieves state-of-the-art performance in offline goal-conditioned navigation and yields consistent gains over QRL in manipulation tasks, matching temporal-difference methods.
Problem

Research questions and friction points this paper is trying to address.

Goal-Conditioned Reinforcement Learning for goal reaching tasks
Improving generalization in value function learning via quasimetric constraints
Addressing complex dynamics through hierarchical decomposition in RL
Innovation

Methods, ideas, or system contributions that make the work stand out.

Eikonal PDE enables trajectory-free quasimetric learning
Hierarchical decomposition handles complex dynamics in goal reaching
Continuous-time reformulation improves out-of-distribution generalization
🔎 Similar Papers
No similar papers found.