🤖 AI Summary
In dense urban environments with high-rise buildings, severe occlusions frequently cause tracking of dynamic targets to fail. Method: This paper proposes a framework in which a quadrotor scout actively tracks multiple ground targets using a neural radiance field (NeRF) built online. It combines online NeRF mapping from RGB and depth images taken at different vantage points with information-gain-driven active perception, and is evaluated in a custom simulator built from OpenStreetMap data of Philadelphia and New York City. The same information-gain objective drives both exploration of unmapped areas and target tracking, giving a first-principles approach in which the scout periodically switches its attention between targets while the map improves. Results: In simulation, for dynamic targets that actively hide behind occlusions, the worst-case tracking error is 200 m, compared with up to 600 m for a greedy baseline. The scout locates 20 stationary targets within 300 steps, although this is slower than the greedy baseline. Tracking also improves as the quality of the NeRF representation improves over time.
📝 Abstract
We study pursuit-evasion games in highly occluded urban environments, e.g., among tall buildings in a city, where a scout (a quadrotor) tracks multiple dynamic targets on the ground. We show that we can build a neural radiance field (NeRF) representation of the city online, using RGB and depth images from different vantage points. This representation is used to calculate the information gain for both exploring unknown parts of the city and tracking the targets, thereby giving a completely first-principles approach to actively tracking dynamic targets. We demonstrate, in a custom-built simulator based on OpenStreetMap data of Philadelphia and New York City, that we can explore and locate 20 stationary targets within 300 steps. This is slower than a greedy baseline, which does not use active perception. But for dynamic targets that actively hide behind occlusions, we show that our approach maintains, at worst, a tracking error of 200 m; the greedy baseline can have a tracking error as large as 600 m. We observe a number of interesting properties in the scout's policies: for example, it periodically switches its attention to track a different target, and as the quality of the NeRF representation improves over time, the scout also becomes better at tracking targets.
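
To make the idea of information-gain-driven viewpoint selection concrete, here is a minimal sketch. It is not the paper's NeRF-based computation: it assumes a hypothetical discrete belief over grid cells (the probability that each cell holds a target), a noiseless sensor that fully resolves every cell it sees, and hand-written visibility sets. All names (`binary_entropy`, `expected_gain`, `pick_viewpoint`, the cell and viewpoint labels) are illustrative assumptions.

```python
import math


def binary_entropy(p):
    """Shannon entropy in bits of a Bernoulli(p) variable."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a certain outcome carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def expected_gain(belief, visible_cells):
    """Entropy removed by observing visible_cells.

    Assumption: the sensor is noiseless, so an observed cell's
    entropy drops to zero and the gain is its prior entropy.
    """
    return sum(binary_entropy(belief[c]) for c in visible_cells)


def pick_viewpoint(belief, candidates):
    """Return the candidate whose visible cells carry the most entropy."""
    return max(candidates, key=lambda name: expected_gain(belief, candidates[name]))


# Hypothetical belief: probability that each cell contains a target.
belief = {"a": 0.5, "b": 0.5, "c": 0.9, "d": 0.0}

# Hypothetical visibility sets; in an occluded city these would come
# from ray casting through the current map, not from a fixed table.
candidates = {"north": ["a", "b"], "south": ["c", "d"]}

best = pick_viewpoint(belief, candidates)
```

In this toy setting the scout prefers the viewpoint that sees the two maximally uncertain cells over the one that sees cells whose state is already nearly known, which is the intuition behind using one objective for both exploration and tracking.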