Deep Reinforcement Learning for Real-Time Drone Routing in Post-Disaster Road Assessment Without Domain Knowledge

📅 2025-09-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Existing UAV path planning for post-disaster road damage assessment relies heavily on manual algorithm design, suffers from high computational latency (100–2,000 s), and generalizes poorly across diverse road networks. Method: We propose a deep reinforcement learning framework that requires no domain knowledge for algorithm design and reformulates link-based routing as a node-based sequential decision-making problem. The approach employs an attention-based encoder-decoder architecture, integrates a network transformation mechanism, and adopts multi-task training with policy optimization with multiple optima (POMO) so that one model generalizes across varying problem scales, drone numbers, and time constraints. The model is trained on synthetic road networks and evaluated on real-world road networks. Contribution/Results: Experiments demonstrate a 16–69% improvement in solution quality over commercial solvers, with inference times of only 1–2 seconds, enabling real-time assessment. Compared with traditional optimization methods, which take 100–2,000 seconds, this is a substantial gain in both speed and generalizability.

📝 Abstract

Rapid post-disaster road damage assessment is critical for effective emergency response, yet traditional optimization methods suffer from excessive computational time and require domain knowledge for algorithm design, making them unsuitable for time-sensitive disaster scenarios. This study proposes an attention-based encoder-decoder model (AEDM) for real-time drone routing decisions in post-disaster road damage assessment. The method employs deep reinforcement learning to determine high-quality drone assessment routes without requiring algorithmic design knowledge. A network transformation method is developed to convert link-based routing problems into equivalent node-based formulations, while a synthetic road network generation technique addresses the scarcity of large-scale training datasets. The model is trained using policy optimization with multiple optima (POMO), combined with multi-task learning to handle diverse parameter combinations. Experimental results demonstrate two key strengths of AEDM: it outperforms commercial solvers by 16–69% in solution quality, and it achieves real-time inference (1–2 seconds) versus 100–2,000 seconds for traditional methods. The model exhibits strong generalization across varying problem scales, drone numbers, and time constraints, consistently outperforming baseline methods on unseen parameter distributions and real-world road networks. The proposed method effectively balances computational efficiency with solution quality, making it particularly suitable for time-critical disaster response applications where rapid decision-making is essential for saving lives.
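
The abstract names POMO as the training method but does not spell out its mechanics. As generally described, POMO samples several rollouts of the same problem instance and uses their mean reward as a shared baseline, so each rollout is reinforced only to the extent that it beats its siblings. The following is a minimal sketch of that baseline step only; the function names `pomo_advantages` and `pomo_loss` are hypothetical, and the paper's actual reward definition and network are not reproduced here.

```python
from statistics import mean


def pomo_advantages(rewards):
    """Advantage of each rollout relative to the shared baseline.

    rewards: total reward of each rollout, all sampled from the SAME
    problem instance (an assumption of this sketch).
    """
    baseline = mean(rewards)  # shared baseline: average over the rollouts
    return [r - baseline for r in rewards]


def pomo_loss(rewards, log_probs):
    """REINFORCE-style loss using the shared baseline.

    log_probs: log-probability of each rollout under the current policy.
    Minimizing this value raises the probability of above-average rollouts.
    """
    advantages = pomo_advantages(rewards)
    return -mean(a * lp for a, lp in zip(advantages, log_probs))
```

Because the baseline comes from the rollouts themselves, no separate critic network is needed, which is one reason the approach suits a setting where algorithm design effort is meant to be minimal.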

Problem

Research questions and friction points this paper is trying to address.

Real-time drone routing for post-disaster road damage assessment

Eliminating domain knowledge requirement in optimization algorithm design

Overcoming computational inefficiency of traditional optimization methods

Innovation

Methods, ideas, or system contributions that make the work stand out.

Attention-based encoder-decoder model for drone routing

Deep reinforcement learning without domain knowledge

Network transformation for node-based problem formulation
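
To illustrate what a link-to-node reformulation can look like, the sketch below builds a line graph: every road link becomes a node, and two such nodes are connected when the original links share an intersection. This is a standard graph construction used here purely as an assumption for illustration; this summary does not describe the paper's own transformation, which may differ. The function name `links_to_nodes` is hypothetical.

```python
from itertools import combinations


def links_to_nodes(links):
    """Convert a link-based road network into a node-based one (line graph).

    links: list of (u, v) pairs, each an undirected road segment between
    intersections u and v.
    Returns (nodes, adjacency), where nodes maps a new node id to the
    original link and adjacency maps each node id to its neighbours.
    """
    nodes = {i: link for i, link in enumerate(links)}
    adjacency = {i: set() for i in nodes}
    for i, j in combinations(nodes, 2):
        # Two links are neighbours if they share at least one endpoint.
        if set(nodes[i]) & set(nodes[j]):
            adjacency[i].add(j)
            adjacency[j].add(i)
    return nodes, adjacency


# Usage: three road segments in a row, A-B, B-C, C-D.
nodes, adjacency = links_to_nodes([("A", "B"), ("B", "C"), ("C", "D")])
```

After such a transformation, "assess this road link" becomes "visit this node", which is the form that node-based routing models typically expect.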

🔎 Similar Papers

Deep Reinforcement Learning for Dynamic Order Picking in Warehouse Operations