Learning on the Fly: Replay-Based Continual Object Perception for Indoor Drones

📅 2026-02-13
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study addresses catastrophic forgetting in indoor drones that must continually learn new object categories, a problem compounded by the absence of temporally consistent indoor scene datasets. To this end, the authors introduce a novel indoor drone video dataset comprising 14,400 frames and evaluate three replay-based class-incremental learning methods on resource-constrained edge platforms using a lightweight YOLOv11-nano detector: Experience Replay (ER), Maximally Interfered Retrieval (MIR), and Forgetting-Aware Replay (FAR). Experimental results show that FAR achieves an mAP50-95 of 82.96% with only a 5% replay memory budget, confirming the efficacy of replay-based continual learning for drone systems. This work presents the first temporally consistent indoor drone vision dataset and shows that extremely low-memory replay strategies can effectively support continuous object perception on aerial edge devices.

📝 Abstract
Autonomous agents such as indoor drones must learn new object classes in real-time while limiting catastrophic forgetting, motivating Class-Incremental Learning (CIL). However, most unmanned aerial vehicle (UAV) datasets focus on outdoor scenes and offer limited temporally coherent indoor videos. We introduce an indoor dataset of $14,400$ frames capturing inter-drone and ground vehicle footage, annotated via a semi-automatic workflow with a $98.6\%$ first-pass labeling agreement before final manual verification. Using this dataset, we benchmark three replay-based CIL strategies: Experience Replay (ER), Maximally Interfered Retrieval (MIR), and Forgetting-Aware Replay (FAR), using YOLOv11-nano as a resource-efficient detector for deployment-constrained UAV platforms. Under tight memory budgets ($5$--$10\%$ replay), FAR outperforms ER and MIR, achieving an average accuracy (ACC, $mAP_{50-95}$ across increments) of $82.96\%$ with $5\%$ replay. Gradient-weighted class activation mapping (Grad-CAM) analysis shows attention shifts across classes in mixed scenes, which is associated with reduced localization quality for drones. The experiments further demonstrate that replay-based continual learning can be effectively applied to edge aerial systems. Overall, this work contributes an indoor UAV video dataset with preserved temporal coherence and an evaluation of replay-based CIL under limited replay budgets. Project page: https://spacetime-vision-robotics-laboratory.github.io/learning-on-the-fly-cl
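To make the "tight replay budget" idea concrete, here is a minimal sketch of the Experience Replay baseline the paper benchmarks. This is not the authors' implementation: `ReplayBuffer`, `er_step`, and the reservoir-sampling retention policy are illustrative assumptions, standing in for whatever buffer management and detector training loop the paper actually uses.

```python
import random

class ReplayBuffer:
    """Fixed-budget replay memory for class-incremental learning.

    Retains a small fraction of past samples (e.g. the 5% budget that is
    the paper's tightest setting) via reservoir sampling, so every sample
    seen so far has an equal chance of staying in memory.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = []   # retained (frame, annotation) pairs
        self.seen = 0      # total samples observed in the stream

    def add(self, sample):
        # Reservoir sampling: uniform subsample under a hard memory cap.
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(sample)
        else:
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.buffer[j] = sample

    def replay_batch(self, k):
        # Draw up to k old samples to interleave with the current task.
        return random.sample(self.buffer, min(k, len(self.buffer)))

def er_step(model_update, batch, buffer, replay_k=8):
    """One ER training step: mix new-task data with replayed memories,
    take an optimizer step on the mixed batch, then store the new data."""
    mixed = list(batch) + buffer.replay_batch(replay_k)
    model_update(mixed)
    for sample in batch:
        buffer.add(sample)
```

MIR and FAR differ only in *which* buffered samples they replay (most-interfered vs. most-forgotten), so they would swap the uniform `replay_batch` for a scored selection.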
Problem

Research questions and friction points this paper is trying to address.

Class-Incremental Learning
Catastrophic Forgetting
Indoor Drones
Continual Object Perception
Replay-Based Learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Class-Incremental Learning
Replay-Based Continual Learning
Indoor UAV Dataset
Forgetting-Aware Replay
Edge Aerial Systems
Sebastian-Ion Nae
National University of Science and Technology Politehnica Bucharest, Romania
Mihai-Eugen Barbu
National University of Science and Technology Politehnica Bucharest, Romania
Sebastian Mocanu
National University of Science and Technology Politehnica Bucharest, Romania
Marius Leordeanu
Professor, University Politehnica Bucharest and Norwegian Research Center (NORCE)
Computer Vision · Machine Learning · Robotics · Artificial Intelligence