Cost-Aware Diffusion Active Search

📅 2026-02-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of enabling efficient online adaptive search for agents in environments with an unknown number of targets, while balancing exploration, exploitation, and decision-making costs. It introduces diffusion models to active search tasks for the first time, leveraging their sequence modeling capability to directly sample forward-looking action sequences—thereby circumventing the need for computationally expensive search trees and supporting both single-agent and multi-agent collaborative decision-making. To mitigate the optimistic bias inherent in existing diffusion-based reinforcement learning approaches for non-myopic planning, the authors propose a correction mechanism and design a cost-aware, efficient search strategy. Experiments demonstrate that the proposed method achieves higher target recovery rates than standard offline reinforcement learning baselines and offers substantially improved computational efficiency compared to conventional tree-search methods.

📝 Abstract
Active search for recovering objects of interest through online, adaptive decision making with autonomous agents requires trading off exploration of unknown environments with exploitation of prior observations in the search space. Prior work has proposed myopic, greedy approaches based on information gain and Thompson sampling for agents to actively decide query or search locations when the number of targets is unknown. Decision-making algorithms in such partially observable environments have also shown that agents capable of lookahead over a finite horizon outperform myopic policies for active search. Unfortunately, lookahead algorithms typically rely on building a computationally expensive search tree that is simulated and updated based on the agent's observations and a model of the environment dynamics. Instead, in this work, we leverage the sequence modeling abilities of diffusion models to sample lookahead action sequences that balance the exploration-exploitation trade-off for active search without building an exhaustive search tree. We identify the optimism bias in prior diffusion-based reinforcement learning approaches when applied to the active search setting and propose mitigating solutions for efficient cost-aware decision making with both single- and multi-agent teams. Our proposed algorithm outperforms standard offline reinforcement learning baselines in terms of full recovery rate and is computationally more efficient than tree search in cost-aware active decision making.
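The core idea in the abstract, sampling lookahead action sequences from a diffusion model and selecting among them with a cost-aware criterion instead of expanding a search tree, can be illustrated with a minimal sketch. This is not the paper's algorithm: `sample_action_sequence` is a hypothetical stand-in for a trained diffusion policy (a real model would use a learned denoising network), and `value_fn`, `cost_fn`, and the trade-off weight `lam` are illustrative assumptions.

```python
import numpy as np

def sample_action_sequence(rng, horizon, n_actions, steps=20):
    """Hypothetical stand-in for a trained diffusion policy: starts from
    Gaussian noise and iteratively "denoises" an action-sequence tensor.
    Here the update is a toy annealing step, not a learned model."""
    x = rng.standard_normal((horizon, n_actions))
    for t in range(steps, 0, -1):
        noise_scale = t / steps
        x = (1 - 1 / steps) * x + 0.1 * noise_scale * rng.standard_normal(x.shape)
    # Discretize: one search location index per lookahead step.
    return x.argmax(axis=1)

def cost_aware_plan(rng, value_fn, cost_fn, horizon=5, n_actions=16, k=32, lam=0.5):
    """Sample k candidate lookahead sequences and keep the one maximizing
    estimated value minus lam-weighted cost -- no search tree is built."""
    best_seq, best_score = None, -np.inf
    for _ in range(k):
        seq = sample_action_sequence(rng, horizon, n_actions)
        score = value_fn(seq) - lam * cost_fn(seq)
        if score > best_score:
            best_seq, best_score = seq, score
    return best_seq, best_score
```

A toy usage: score a sequence by how many distinct locations it covers (exploration proxy) and charge travel distance between consecutive locations as the cost, so the planner prefers diverse but compact routes.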
Problem

Research questions and friction points this paper is trying to address.

active search
exploration-exploitation trade-off
cost-aware decision making
partially observable environments
multi-agent systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

diffusion models
active search
cost-aware decision making
lookahead planning
multi-agent systems