🤖 AI Summary
This work addresses the inefficiency of static multispectral sensor configurations in dynamic environments, which often waste bandwidth, computation, and energy because they cannot adaptively prioritize critical sensing modalities. To overcome this limitation, the authors propose a closed-loop adaptive perception framework that uses a task-specific detection backbone to score the contribution of each modality—RGB, infrared, millimeter-wave, and depth—and employs a reinforcement learning agent to dynamically optimize hardware parameters such as sensor sampling frequency and resolution in real time. Crucially, the framework incorporates physical sensing costs into the decision-making process, enabling end-to-end dynamic control from feature reweighting down to hardware configuration. Experiments on a mobile robotic platform demonstrate that the method reduces GPU load by 29.3% with only a 5.3% drop in accuracy compared to a heuristic baseline.
📝 Abstract
Multi-sensor fusion is central to robust robotic perception, yet most existing systems operate under static sensor configurations, collecting all modalities at fixed rates and fidelity regardless of their situational utility. This rigidity wastes bandwidth, computation, and energy, and prevents systems from prioritizing sensors under challenging conditions such as poor lighting or occlusion. Recent advances in reinforcement learning (RL) and modality-aware fusion suggest the potential for adaptive perception, but prior efforts have largely focused on re-weighting features at inference time, ignoring the physical cost of sensor data collection. We introduce a framework that unifies sensing, learning, and actuation into a closed reconfiguration loop. A task-specific detection backbone extracts multispectral features (e.g., RGB, IR, mmWave, depth) and produces quantitative contribution scores for each modality. These scores are passed to an RL agent, which dynamically adjusts sensor configurations, including sampling frequency, resolution, and sensing range, in real time. Less informative sensors are down-sampled or deactivated, while critical sensors are sampled at higher fidelity as environmental conditions evolve. We implement and evaluate this framework on a mobile rover, showing that adaptive control reduces GPU load by 29.3% with only a 5.3% accuracy drop compared to a heuristic baseline. These results highlight the potential of resource-aware adaptive sensing for embedded robotic platforms.
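The decision step in the loop above—turning per-modality contribution scores into a sensor configuration while penalizing sensing cost—can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the modality names come from the abstract, but the candidate rates, per-modality costs, and the greedy `choose_rates` policy (a stand-in for the learned RL policy) are all hypothetical.

```python
# Minimal sketch of the cost-aware reconfiguration step described in the
# abstract. The greedy policy below is an illustrative stand-in for the
# paper's RL agent; rates and costs are invented for the example.

MODALITIES = ["rgb", "ir", "mmwave", "depth"]

# Candidate sampling rates (Hz) the agent may assign; 0 Hz deactivates a sensor.
RATES = [0, 5, 15, 30]

# Hypothetical per-Hz sensing cost (a proxy for bandwidth/energy) per modality.
COST_PER_HZ = {"rgb": 1.0, "ir": 0.6, "mmwave": 0.4, "depth": 0.8}


def choose_rates(scores, cost_weight=0.02):
    """For each modality, pick the rate maximizing
    (contribution score * normalized rate) - cost_weight * sensing cost.

    Low-contribution sensors end up down-sampled or deactivated (rate 0),
    mirroring the behavior the abstract describes.
    """
    config = {}
    for m in MODALITIES:
        best_rate, best_value = 0, float("-inf")
        for r in RATES:
            utility = scores[m] * (r / max(RATES))
            cost = cost_weight * COST_PER_HZ[m] * r
            if utility - cost > best_value:
                best_rate, best_value = r, utility - cost
        config[m] = best_rate
    return config


if __name__ == "__main__":
    # Example: poor lighting, so IR contributes most and RGB very little.
    scores = {"rgb": 0.1, "ir": 0.9, "mmwave": 0.5, "depth": 0.4}
    print(choose_rates(scores))  # RGB is deactivated, IR runs at full rate
```

A learned policy would additionally condition on context (lighting, occlusion, battery state) and be trained against the detection reward minus sensing cost, but the interface—scores in, per-sensor configuration out—is the same.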