🤖 AI Summary
Autonomous driving systems suffer from high computational overhead: while existing attention mechanisms implicitly filter interacting agents, their quadratic complexity O(n²) hinders real-time inference in dense traffic scenarios. This paper proposes a reinforcement learning–based dynamic agent selection method that formulates critical agent identification as a Markov decision process. A reward-driven policy learns the relevance of each dynamic agent (e.g., vehicles, pedestrians) to ego-vehicle behavior and generates binary saliency masks. Integrated with a pre-trained behavior model, the approach enables efficient attention pruning while preserving safety, traffic throughput, and overall progress. Experiments on a large-scale driving dataset demonstrate substantial input dimensionality reduction—up to an order of magnitude—without degrading decision-making performance. The method points toward lightweight, interpretable autonomous driving decision-making.
📝 Abstract
Human drivers focus on only a handful of agents at any one time. Autonomous driving systems, in contrast, process complex scenes with numerous agents, regardless of whether they are pedestrians on a crosswalk or vehicles parked on the side of the road. While attention mechanisms offer an implicit way to reduce the input to the elements that affect decisions, existing attention mechanisms for capturing agent interactions scale quadratically with the number of agents and are generally computationally expensive. We propose RDAR, a strategy to learn per-agent relevance -- how much each agent influences the behavior of the controlled vehicle -- by identifying which agents can be excluded from the input to a pre-trained behavior model. We formulate the masking procedure as a Markov Decision Process where the action is a binary mask indicating agent selection. We evaluate RDAR on a large-scale driving dataset, and demonstrate its ability to learn an accurate numerical measure of relevance by achieving comparable driving performance, in terms of overall progress, safety, and traffic throughput, while processing significantly fewer agents compared to a state-of-the-art behavior model.
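The abstract's masking idea can be sketched as follows. This is a minimal illustrative toy, not the paper's implementation: the function names (`select_agents`, `masked_input`, `sparsity_reward`), the thresholding of relevance scores, and the sparsity-penalized reward are all assumptions; the paper's actual policy, reward shaping, and behavior-model interface are not specified here.

```python
import numpy as np

rng = np.random.default_rng(0)

def select_agents(relevance_scores, threshold=0.5):
    """Binary mask (the MDP action): keep agents whose relevance
    score meets a threshold. A learned policy would output this mask."""
    return (relevance_scores >= threshold).astype(int)

def masked_input(agent_features, mask):
    """Drop masked-out agents before the scene reaches the
    pre-trained behavior model, shrinking its input."""
    return agent_features[mask.astype(bool)]

def sparsity_reward(driving_reward, mask, sparsity_weight=0.1):
    """Hypothetical reward: driving performance minus a penalty per
    agent kept, pushing the policy to process as few agents as possible."""
    return driving_reward - sparsity_weight * mask.sum()

# Toy scene: 8 agents, 4 features each; random scores stand in
# for relevance estimates produced by a learned policy.
agents = rng.normal(size=(8, 4))
scores = rng.uniform(size=8)
mask = select_agents(scores)
reduced = masked_input(agents, mask)
print(reduced.shape[0], "of", agents.shape[0], "agents kept")
print("reward:", sparsity_reward(driving_reward=1.0, mask=mask))
```

The design choice illustrated is the trade-off the abstract describes: the mask is a discrete action, and the reward must balance driving quality against the number of agents processed.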