Learning on the Fly: Replay-Based Continual Object Perception for Indoor Drones

📅 2026-02-13
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study addresses catastrophic forgetting in indoor drones that must continually learn new object categories, a problem compounded by the absence of temporally consistent indoor scene datasets. To this end, the authors introduce a novel indoor drone video dataset comprising 14,400 frames and evaluate three replay-based class-incremental learning methods on resource-constrained edge platforms using a lightweight YOLOv11-nano detector: Experience Replay (ER), Maximally Interfered Retrieval (MIR), and Forgetting-Aware Replay (FAR). Experimental results show that FAR achieves an mAP50-95 of 82.96% with only a 5% replay memory budget, confirming the efficacy of replay-based continual learning for drone systems. This work presents the first temporally consistent indoor drone vision dataset and shows that extremely low-memory replay strategies can effectively support continuous object perception on aerial edge devices.

📝 Abstract
Autonomous agents such as indoor drones must learn new object classes in real-time while limiting catastrophic forgetting, motivating Class-Incremental Learning (CIL). However, most unmanned aerial vehicle (UAV) datasets focus on outdoor scenes and offer limited temporally coherent indoor videos. We introduce an indoor dataset of $14,400$ frames capturing inter-drone and ground vehicle footage, annotated via a semi-automatic workflow with a $98.6\%$ first-pass labeling agreement before final manual verification. Using this dataset, we benchmark three replay-based CIL strategies: Experience Replay (ER), Maximally Interfered Retrieval (MIR), and Forgetting-Aware Replay (FAR), using YOLOv11-nano as a resource-efficient detector for deployment-constrained UAV platforms. Under tight memory budgets ($5$--$10\%$ replay), FAR outperforms ER and MIR, achieving an average accuracy (ACC, $mAP_{50-95}$ across increments) of $82.96\%$ with $5\%$ replay. Gradient-weighted class activation mapping (Grad-CAM) analysis shows attention shifts across classes in mixed scenes, which is associated with reduced localization quality for drones. The experiments further demonstrate that replay-based continual learning can be effectively applied to edge aerial systems. Overall, this work contributes an indoor UAV video dataset with preserved temporal coherence and an evaluation of replay-based CIL under limited replay budgets. Project page: https://spacetime-vision-robotics-laboratory.github.io/learning-on-the-fly-cl
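To make the "tight replay budget" idea concrete, here is a minimal sketch of the Experience Replay baseline the paper benchmarks. This is not the authors' implementation: `ReplayBuffer`, `er_step`, and the reservoir-sampling retention policy are illustrative assumptions, standing in for whatever buffer management and detector training loop the paper actually uses.

```python
import random

class ReplayBuffer:
    """Fixed-budget replay memory for class-incremental learning.

    Retains a small fraction of past samples (e.g. the 5% budget that is
    the paper's tightest setting) via reservoir sampling, so every sample
    seen so far has an equal chance of staying in memory.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = []   # retained (frame, annotation) pairs
        self.seen = 0      # total samples observed in the stream

    def add(self, sample):
        # Reservoir sampling: uniform subsample under a hard memory cap.
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(sample)
        else:
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.buffer[j] = sample

    def replay_batch(self, k):
        # Draw up to k old samples to interleave with the current task.
        return random.sample(self.buffer, min(k, len(self.buffer)))

def er_step(model_update, batch, buffer, replay_k=8):
    """One ER training step: mix new-task data with replayed memories,
    take an optimizer step on the mixed batch, then store the new data."""
    mixed = list(batch) + buffer.replay_batch(replay_k)
    model_update(mixed)
    for sample in batch:
        buffer.add(sample)
```

MIR and FAR differ only in *which* buffered samples they replay (most-interfered vs. most-forgotten), so they would swap the uniform `replay_batch` for a scored selection.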
Problem

Research questions and friction points this paper is trying to address.

Class-Incremental Learning
Catastrophic Forgetting
Indoor Drones
Continual Object Perception
Replay-Based Learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Class-Incremental Learning
Replay-Based Continual Learning
Indoor UAV Dataset
Forgetting-Aware Replay
Edge Aerial Systems
Sebastian-Ion Nae
National University of Science and Technology Politehnica Bucharest, Romania
Mihai-Eugen Barbu
National University of Science and Technology Politehnica Bucharest, Romania
Sebastian Mocanu
National University of Science and Technology Politehnica Bucharest, Romania
Marius Leordeanu
Professor, University Politehnica Bucharest and Norwegian Research Center (NORCE)
Computer Vision · Machine Learning · Robotics · Artificial Intelligence