Value-Decomposed Reinforcement Learning Framework for Taxiway Routing with Hierarchical Conflict-Aware Observations

📅 2026-05-09
📈 Citations: 0
Influential: 0

🤖 AI Summary
This study addresses multi-aircraft taxiway routing at airports, where real-time performance, safety, and efficiency must be achieved together; the main difficulties are modeling downstream traffic conflicts and balancing competing optimization objectives. The authors propose a reinforcement learning approach built on a grid-based representation of the airport surface. It combines action masking, a hierarchical foresight encoding of traffic states, and a value decomposition mechanism, so that the policy can account for both immediate and anticipated conflicts while prioritizing sparse yet critical safety objectives. Evaluated in a simulated environment modeled after Changsha Huanghua International Airport at multiple traffic density levels, the approach achieves better safety-efficiency trade-offs than representative planning, optimization, and reinforcement learning baselines while maintaining practical runtime.
📝 Abstract
Taxiway routing and on-surface conflict avoidance are coupled safety-critical decision problems in airport surface operations. Existing planning and optimization methods are often limited by online computational cost, while reinforcement learning methods may struggle to represent downstream traffic conflicts and balance multiple objectives. This paper presents Conflict-aware Taxiway Routing (CaTR), a reinforcement learning framework for real-time multi-aircraft taxiway routing. CaTR constructs a grid-based airport surface environment with action masking, introduces a hierarchical foresight traffic representation to encode current and downstream conflict-related traffic conditions, and adopts a value-decomposed reinforcement learning strategy to prioritize sparse but safety-critical objectives. Experiments are conducted in a realistic environment based on Changsha Huanghua International Airport under multiple traffic density levels. Results show that CaTR achieves better safety-efficiency trade-offs than representative planning, optimization, and reinforcement learning baselines while maintaining practical runtime.
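To make the idea of action masking on a grid-based surface concrete, the following is a minimal sketch and not the authors' implementation. The five-action set, the encoding of taxiway cells as 1, and the names `action_mask` and `masked_greedy` are all assumptions introduced for illustration; the abstract does not specify CaTR's action space or masking rules.

```python
import math

# Hypothetical action set (assumption): four grid moves plus "wait".
ACTIONS = {
    "up": (-1, 0),
    "down": (1, 0),
    "left": (0, -1),
    "right": (0, 1),
    "wait": (0, 0),
}


def action_mask(grid, pos):
    """Return {action: bool}; True if the move lands on a taxiway cell.

    Assumption: grid[r][c] == 1 marks a taxiway cell, 0 marks off-taxiway.
    """
    rows, cols = len(grid), len(grid[0])
    mask = {}
    for name, (dr, dc) in ACTIONS.items():
        r, c = pos[0] + dr, pos[1] + dc
        mask[name] = 0 <= r < rows and 0 <= c < cols and grid[r][c] == 1
    return mask


def masked_greedy(q_values, mask):
    """Pick the highest-valued action among those the mask allows."""
    best, best_q = None, -math.inf
    for name, q in q_values.items():
        if mask[name] and q > best_q:
            best, best_q = name, q
    return best


# Toy 3x3 surface: 1 = taxiway, 0 = not a taxiway.
GRID = [
    [1, 1, 0],
    [0, 1, 0],
    [0, 1, 1],
]

mask = action_mask(GRID, (0, 1))
# "right" has the highest raw value but leads off the taxiway, so it is masked out.
q = {"up": 0.1, "down": 0.7, "left": 0.2, "right": 0.9, "wait": 0.0}
choice = masked_greedy(q, mask)
```

Masking removes infeasible moves before action selection, so the policy never has to learn from experience that leaving the taxiway is invalid.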
Problem

Research questions and friction points this paper is trying to address.

taxiway routing
conflict avoidance
airport surface operations
multi-aircraft coordination
safety-critical decision making
Innovation

Methods, ideas, or system contributions that make the work stand out.

value-decomposed reinforcement learning
hierarchical conflict-aware observations
taxiway routing
multi-aircraft coordination
action masking
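
One common way to realize value decomposition over multiple objectives is to keep a separate value estimate per objective and combine them only at decision time. The sketch below illustrates that general pattern under stated assumptions; it is not CaTR's actual architecture. The objective names (`safety`, `efficiency`), the weighted-sum combination, and the weights are hypothetical, since the abstract does not say how the decomposed values are combined.

```python
def combine_values(q_heads, weights):
    """Weighted sum of per-objective Q-values for each action.

    q_heads: {objective: {action: q_value}}
    weights: {objective: weight}  (hypothetical priorities)
    """
    actions = next(iter(q_heads.values())).keys()
    return {
        a: sum(weights[k] * q_heads[k][a] for k in q_heads)
        for a in actions
    }


def td_targets(rewards, next_q_heads, next_action, gamma=0.99):
    """One TD target per objective, each built from its own reward stream.

    Keeping the targets separate means a sparse safety reward is not
    averaged away by a dense efficiency reward.
    """
    return {
        k: rewards[k] + gamma * next_q_heads[k][next_action]
        for k in rewards
    }


# Toy values (assumptions): "go" is faster but risks a conflict.
q_heads = {
    "safety": {"go": -1.0, "wait": 0.0},
    "efficiency": {"go": 0.5, "wait": -0.1},
}
weights = {"safety": 2.0, "efficiency": 1.0}

combined = combine_values(q_heads, weights)
best_action = max(combined, key=combined.get)

targets = td_targets(
    rewards={"safety": -1.0, "efficiency": 0.5},
    next_q_heads=q_heads,
    next_action="wait",
    gamma=0.5,
)
```

With the safety head weighted more heavily, the combined value prefers `wait` even though `go` scores higher on efficiency alone.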