Neural Backward Reach-Avoid Tubes with MPC Supervision for High-Dimensional Systems: An Application to Safe Spacecraft Docking

📅 2026-05-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of simultaneously ensuring collision avoidance and target reachability in high-dimensional spacecraft docking, where classical grid-based Hamilton-Jacobi (HJ) solvers do not scale. The authors propose a learning-based Backward Reach-Avoid Tube (BRAT) framework that integrates model predictive control (MPC) supervision into HJ value-function learning. Offline, a neural approximation of the HJ value function is trained with PDE-based losses augmented by curriculum-driven MPC supervision. Online, the learned value function is deployed through two real-time controllers: one driven by the value gradient, and a terminal MPC that uses the value function to enforce reachability at the horizon. The method is evaluated on a 6D planar docking problem against grid-based ground truth and then scaled to the full 13D system; in both settings it is reported to outperform existing methods in success rate and computational efficiency.
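As a rough illustration of how a PDE-based loss and MPC-derived value targets might be combined during offline training, the sketch below adds a supervised term to a PDE-residual term with a ramped weight. Every name here (`curriculum_weight`, `combined_loss`, `warmup_steps`) is hypothetical, and the linear ramp is invented for illustration; the source does not say how the curriculum is scheduled or how the terms are weighted.

```python
import numpy as np

def curriculum_weight(step, warmup_steps):
    """Hypothetical schedule: ramp linearly from 0 to 1, then hold at 1."""
    return min(1.0, step / warmup_steps)

def combined_loss(pde_residual, v_pred, v_mpc, step, warmup_steps):
    """Mean-squared PDE residual plus a weighted supervised term.

    pde_residual: HJ PDE residuals at sampled states (assumed precomputed).
    v_pred:       network value predictions at MPC-visited states.
    v_mpc:        value targets derived from MPC rollouts (assumed given).
    """
    pde_term = np.mean(pde_residual ** 2)
    sup_term = np.mean((v_pred - v_mpc) ** 2)
    return pde_term + curriculum_weight(step, warmup_steps) * sup_term

# Toy numbers only; these are not results from the paper.
loss = combined_loss(
    pde_residual=np.array([1.0, -1.0]),
    v_pred=np.array([0.0, 2.0]),
    v_mpc=np.array([0.0, 0.0]),
    step=50,
    warmup_steps=100,
)
```

In a real training loop the residuals and predictions would come from an automatic-differentiation framework rather than fixed arrays; NumPy is used here only to keep the arithmetic visible.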

📝 Abstract

Autonomous spacecraft docking requires control policies that simultaneously ensure collision avoidance and target reachability under coupled, high-dimensional translational-rotational dynamics. Hamilton-Jacobi (HJ) reachability provides formal reach-avoid guarantees, but classical solvers are limited to low-dimensional systems. Learning-based approaches have begun to scale HJ analysis, yet they struggle in reach-avoid settings, especially where goal and failure sets are tightly coupled, as in docking. We propose a learning-based Backward Reach-Avoid Tube (BRAT) framework that addresses this challenge by tightly integrating HJ structure with MPC-based supervision. In the offline phase, we train a neural approximation of the HJ value function using PDE-based losses augmented with curriculum-driven MPC supervision, which provides informative value targets and stabilizes training in regions where purely PDE-based methods fail. In the online phase, the learned value function is deployed through two real-time controllers: (i) a value gradient-driven controller, and (ii) a value-function-augmented terminal MPC that explicitly enforces reachability at the horizon. We evaluate the proposed method on a 6D planar docking problem against grid-based ground truth and then scale to the full 13D system. Across both settings, our approach outperforms existing methods in success rate and computational efficiency.
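The value gradient-driven controller mentioned above can be pictured with a minimal sketch, under assumptions that are not taken from the paper: control-affine dynamics x' = f(x) + g(x)u, box-bounded inputs, and a sign convention in which lower values are better, so the controller picks the input that decreases the value fastest. The toy value function and double-integrator input matrix are stand-ins, not the learned network or the docking dynamics.

```python
import numpy as np

def value_gradient(V, x, eps=1e-5):
    """Central finite-difference gradient of a scalar function V at x."""
    grad = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        step = np.zeros_like(x, dtype=float)
        step[i] = eps
        grad[i] = (V(x + step) - V(x - step)) / (2 * eps)
    return grad

def gradient_controller(V, g, x, u_max):
    """Bang-bang input minimizing grad V(x) . g(x) u with |u_i| <= u_max.

    Assumes lower V is better; flip the sign for the opposite convention.
    """
    grad = value_gradient(V, x)
    return -u_max * np.sign(g(x).T @ grad)

# Hypothetical stand-ins: a quadratic "value" and a double integrator.
def toy_value(x):
    p, v = x
    return p ** 2 + (p + v) ** 2

def toy_g(x):
    return np.array([[0.0], [1.0]])  # input acts on velocity only

x0 = np.array([1.0, 0.0])
u0 = gradient_controller(toy_value, toy_g, x0, u_max=1.0)
```

At `x0` the gradient is approximately `[4, 2]`, so the controller saturates the single input in the negative direction. With a neural value function the gradient would normally come from automatic differentiation instead of finite differences.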

Problem

Research questions and friction points this paper is trying to address.

spacecraft docking

reach-avoid

high-dimensional systems

Hamilton-Jacobi reachability

collision avoidance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Backward Reach-Avoid Tube

Hamilton-Jacobi reachability

MPC supervision