Trajectory Planning for UAV-Based Smart Farming Using Imitation-Based Triple Deep Q-Learning

2025-12-21
Citations: 0
Influential: 0
AI Summary
Problem: Trajectory planning for agricultural UAVs remains challenging because of high environmental uncertainty, partial observability, and limited battery capacity. Method: The paper formulates trajectory planning as a Markov decision process (MDP), solves it with multi-agent reinforcement learning (MARL), and proposes an imitation-based triple deep Q-network (ITDQN). ITDQN uses an elite imitation mechanism to reduce exploration cost and adds a mediator Q-network on top of a double deep Q-network (DDQN) to accelerate and stabilize training. Contribution/Results: In experiments in both simulated and real-world environments, ITDQN outperforms DDQN by 4.43% in weed recognition rate and 6.94% in wireless sensor data collection rate. The result is a reinforcement learning framework for trajectory planning of battery-constrained agricultural UAVs.

Abstract
Unmanned aerial vehicles (UAVs) have emerged as a promising auxiliary platform for smart agriculture, capable of simultaneously performing weed detection, recognition, and data collection from wireless sensors. However, trajectory planning for UAV-based smart agriculture is challenging due to the high uncertainty of the environment, partial observations, and limited battery capacity of UAVs. To address these issues, we formulate the trajectory planning problem as a Markov decision process (MDP) and leverage multi-agent reinforcement learning (MARL) to solve it. Furthermore, we propose a novel imitation-based triple deep Q-network (ITDQN) algorithm, which employs an elite imitation mechanism to reduce exploration costs and utilizes a mediator Q-network over a double deep Q-network (DDQN) to accelerate and stabilize training and improve performance. Experimental results in both simulated and real-world environments demonstrate the effectiveness of our solution. Moreover, our proposed ITDQN outperforms DDQN by 4.43% in weed recognition rate and 6.94% in data collection rate.
Problem

Research questions and friction points this paper is trying to address.

Optimizes UAV trajectory planning for smart farming tasks
Addresses environmental uncertainty and partial observation challenges
Improves weed recognition and data collection efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses imitation-based triple deep Q-network algorithm
Employs elite imitation to reduce exploration costs
Utilizes mediator Q-network for stable training
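
The abstract positions ITDQN as an extension of the double deep Q-network (DDQN) baseline but does not spell out the update rule of the mediator Q-network, so that part is not sketched here. As a point of reference only, the following is a minimal sketch of the standard DDQN bootstrap target that the baseline relies on: the online network picks the greedy next action and the target network evaluates it. The function name `ddqn_target`, its parameters, and the example numbers are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def ddqn_target(
    reward: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Standard Double DQN target for a single transition.

    next_q_online: Q-values of the next state from the online network.
    next_q_target: Q-values of the next state from the target network.
    All names here are hypothetical; this is the baseline, not ITDQN.
    """
    if done:
        # No bootstrapping from a terminal state.
        return reward
    # Action selection uses the online network...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ...while action evaluation uses the target network.
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    # Illustrative values only: the online network prefers action 1,
    # so the target network's estimate for action 1 (0.3) is bootstrapped.
    print(ddqn_target(1.0, [0.2, 0.9, 0.1], [0.5, 0.3, 0.8], gamma=0.5))
```

Decoupling action selection from action evaluation is what distinguishes DDQN from plain DQN. According to the abstract, ITDQN adds a third (mediator) Q-network and an elite imitation mechanism on top of this baseline.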
Wencan Mao
National Institute of Informatics, Tokyo, Japan
Quanxi Zhou
The University of Tokyo, Tokyo, Japan
Tomas Couso Coddou
Pontificia Universidad Católica de Chile, Santiago, Chile
Manabu Tsukada
The University of Tokyo
Computer Networking, Internet Mobility, ITS
Yunling Liu
China Agricultural University, Beijing, China
Yusheng Ji
National Institute of Informatics
network, wireless networking, mobile computing