Bayesian Ambiguity Contraction-based Adaptive Robust Markov Decision Processes for Adversarial Surveillance Missions

📅 2025-12-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
In adversarial intelligence, surveillance, and reconnaissance (ISR) tasks, conventional conservative robust Markov decision processes (RMDPs) struggle to balance safety and real-time decision-making under model uncertainty and dynamic threats. To address this, we propose an Adaptive Robust Markov Decision Process (AR-MDP) framework. Its core innovation is a Bayesian ambiguity contraction mechanism that shrinks the ambiguity set online as observations arrive, eliminating infeasible threat models and letting the policy evolve from over-conservatism toward robust proactiveness. The method integrates robust MDPs, Bayesian inference, and a finite set of candidate transition kernels, and introduces a state-alternating structure to enable joint perception–maneuver planning. We provide theoretical guarantees on convergence and safety. Experiments demonstrate that, under both Gaussian and non-Gaussian threat models, AR-MDP significantly outperforms baseline RMDPs, increasing mission reward by 23.6% and reducing exposure rate by 41.2%, while maintaining strong adaptability across diverse network topologies, indicating high practical deployability.

📝 Abstract
Collaborative Combat Aircraft (CCAs) are envisioned to enable autonomous Intelligence, Surveillance, and Reconnaissance (ISR) missions in contested environments, where adversaries may act strategically to deceive or evade detection. These missions pose challenges due to model uncertainty and the need for safe, real-time decision-making. Robust Markov Decision Processes (RMDPs) provide worst-case guarantees but are limited by static ambiguity sets that capture initial uncertainty without adapting to new observations. This paper presents an adaptive RMDP framework tailored to ISR missions with CCAs. We introduce a mission-specific formulation in which aircraft alternate between movement and sensing states. Adversarial tactics are modeled as a finite set of transition kernels, each capturing assumptions about how adversarial sensing or environmental conditions affect rewards. Our approach incrementally refines policies by eliminating inconsistent threat models, allowing agents to shift from conservative to aggressive behaviors while maintaining robustness. We provide theoretical guarantees showing that the adaptive planner converges as credible sets contract to the true threat and maintains safety under uncertainty. Experiments under Gaussian and non-Gaussian threat models across diverse network topologies show higher mission rewards and fewer exposure events compared to nominal and static robust planners.
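The mechanism described in the abstract (a finite set of candidate threat kernels, incremental elimination of inconsistent models, and robust planning over the survivors) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the model set, prior, credibility threshold, and one-step worst-case evaluation are all hypothetical choices for demonstration.

```python
import numpy as np

# Hypothetical sketch of the adaptive-RMDP idea: keep a posterior over a
# finite set of candidate transition kernels, eliminate kernels whose
# posterior credibility falls below a threshold, and plan against the
# worst case among the surviving models. All names and numbers here are
# illustrative assumptions, not taken from the paper.

rng = np.random.default_rng(0)
n_states, n_models = 4, 3

# Finite ambiguity set: each candidate threat model is a row-stochastic
# transition matrix P[s, s'].
models = [rng.dirichlet(np.ones(n_states), size=n_states) for _ in range(n_models)]
true_model = models[1]                         # unknown to the planner
posterior = np.full(n_models, 1.0 / n_models)  # uniform prior over models

threshold = 0.05  # credibility level below which a model is eliminated

for _ in range(200):
    # Observe one transition generated by the true (hidden) threat model.
    s = rng.integers(n_states)
    s_next = rng.choice(n_states, p=true_model[s])
    # Bayesian update: reweight each candidate by its likelihood of the
    # observed transition, then renormalize.
    likelihood = np.array([m[s, s_next] for m in models])
    posterior *= likelihood
    posterior /= posterior.sum()

# Contracted ambiguity set: models still credible after the observations.
credible = [i for i in range(n_models) if posterior[i] >= threshold]

# One step of robust evaluation for illustration: expected reward under
# the worst-case surviving model.
reward = rng.random(n_states)
worst_values = min((models[i] @ reward for i in credible),
                   key=lambda v: v.sum())
```

As observations accumulate, the posterior concentrates on the true kernel, the credible set contracts, and the worst-case planner becomes correspondingly less conservative, which mirrors the shift from conservative to aggressive behavior described above.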
Problem

Research questions and friction points this paper is trying to address.

Adaptive robust decision-making for adversarial surveillance missions
Refining policies by eliminating inconsistent threat models incrementally
Maintaining safety while shifting from conservative to aggressive behaviors
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive RMDP framework for ISR missions
Incremental policy refinement by eliminating inconsistent threat models
Theoretical guarantees for convergence and safety under uncertainty
Jimin Choi
Department of Aerospace Engineering, University of Michigan, Ann Arbor, MI 48109, USA
Max Z. Li
University of Michigan, Ann Arbor
Air transportation systems, network science, applied mathematics, air traffic flow management, signal processing