DiCriTest: Testing Scenario Generation for Decision-Making Agents Considering Diversity and Criticality

📅 2025-08-15
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Safety verification of decision-making agents in dynamic environments is hampered by susceptibility to local optima in high-dimensional scenario spaces and by the difficulty of balancing scenario diversity with criticality. Method: This paper proposes a dual-space guided testing framework that jointly optimizes the scenario parameter space and the agent behavior space. It introduces a parameter-behavior closed-loop feedback mechanism that integrates hierarchical representation, dimensionality reduction, multi-dimensional subspace evaluation, behavioral criticality quantification, and adaptive mode switching to dynamically balance local perturbation and global exploration. Results: Experiments on five decision-making agents show that the framework increases critical scenario generation by 56.23% on average and achieves greater scenario diversity than state-of-the-art baselines under novel parameter-behavior co-driven metrics.

📝 Abstract
The growing deployment of decision-making agents in dynamic environments increases the demand for safety verification. While critical testing scenario generation has emerged as an appealing verification methodology, effectively balancing diversity and criticality remains a key challenge for existing methods, particularly due to local optima entrapment in high-dimensional scenario spaces. To address this limitation, we propose a dual-space guided testing framework that coordinates scenario parameter space and agent behavior space, aiming to generate testing scenarios considering diversity and criticality. Specifically, in the scenario parameter space, a hierarchical representation framework combines dimensionality reduction and multi-dimensional subspace evaluation to efficiently localize diverse and critical subspaces. This guides dynamic coordination between two generation modes: local perturbation and global exploration, optimizing critical scenario quantity and diversity. Complementarily, in the agent behavior space, agent-environment interaction data are leveraged to quantify behavioral criticality/diversity and adaptively support generation mode switching, forming a closed feedback loop that continuously enhances scenario characterization and exploration within the parameter space. Experiments show our framework improves critical scenario generation by an average of 56.23% and demonstrates greater diversity under novel parameter-behavior co-driven metrics when tested on five decision-making agents, outperforming state-of-the-art baselines.
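
The coordination between the two generation modes described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy criticality score, the `threshold`, `patience`, and `step` parameters, and the switching rule (stay local while critical scenarios keep appearing, fall back to global exploration after repeated misses) are all assumptions made for illustration.

```python
import random

def toy_criticality(params):
    """Hypothetical stand-in for running an agent in a scenario.
    Scores peak at 1.0 at the point (0.8, 0.2) and fall to 0 with distance."""
    x, y = params
    distance = ((x - 0.8) ** 2 + (y - 0.2) ** 2) ** 0.5
    return max(0.0, 1.0 - 4.0 * distance)

def generate_scenarios(n_iters=200, threshold=0.5, patience=5, step=0.05, seed=0):
    """Alternate between global exploration and local perturbation.
    All parameters are illustrative assumptions, not values from the paper."""
    rng = random.Random(seed)
    critical = []                      # critical scenarios found so far
    current = (rng.random(), rng.random())
    stale = 0                          # consecutive non-critical candidates
    mode = "global"
    for _ in range(n_iters):
        if mode == "global":
            # Global exploration: sample anywhere in the unit square.
            candidate = (rng.random(), rng.random())
        else:
            # Local perturbation: small clipped step around the last critical scenario.
            candidate = tuple(
                min(1.0, max(0.0, v + rng.uniform(-step, step))) for v in current
            )
        if toy_criticality(candidate) >= threshold:
            critical.append(candidate)
            current = candidate
            stale = 0
            mode = "local"             # exploit the critical region just found
        else:
            stale += 1
            if stale >= patience:      # region appears exhausted: explore again
                mode = "global"
                stale = 0
    return critical
```

In this sketch the behavior-space feedback is reduced to a single scalar score; the framework in the abstract additionally uses interaction data to quantify behavioral diversity when deciding to switch modes.
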
Problem

Research questions and friction points this paper is trying to address.

Balancing diversity and criticality in testing scenarios for decision-making agents
Overcoming local optima in high-dimensional scenario spaces
Generating diverse and critical scenarios via dual-space coordination
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-space guided testing framework balances diversity and criticality
Hierarchical representation localizes diverse and critical subspaces
Agent behavior space feedback loop enhances scenario exploration
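
As a rough illustration of what evaluating subspaces for criticality and diversity could look like, the sketch below partitions an already dimension-reduced 2-D parameter space into grid cells and reports per-cell statistics. The grid partition, the `bins` parameter, the function names, and the coverage measure are assumptions for illustration only; the paper's hierarchical representation and evaluation criteria are not specified in this summary.

```python
from collections import defaultdict

def cell_of(point, bins=4):
    """Map a point in [0, 1]^d to its grid cell index (hypothetical partition)."""
    return tuple(min(bins - 1, int(v * bins)) for v in point)

def evaluate_subspaces(samples, bins=4):
    """samples: iterable of (point, is_critical) pairs.
    Returns visit count and critical rate for each visited cell."""
    stats = defaultdict(lambda: [0, 0])   # cell -> [visits, critical hits]
    for point, is_critical in samples:
        entry = stats[cell_of(point, bins)]
        entry[0] += 1
        entry[1] += int(is_critical)
    return {
        cell: {"visits": n, "critical_rate": c / n}
        for cell, (n, c) in stats.items()
    }

def critical_coverage(samples, bins=4, dims=2):
    """Fraction of all grid cells containing at least one critical scenario,
    used here as a simple proxy for the diversity of critical scenarios."""
    cells = {cell_of(p, bins) for p, is_critical in samples if is_critical}
    return len(cells) / bins ** dims
```

Cells with a high critical rate would be candidates for local perturbation, while cells with few or no visits would be candidates for global exploration.
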
Qitong Chu
School of Automation, Beijing Institute of Technology, Beijing 100811, China
Yufeng Yue
School of Automation, Beijing Institute of Technology, Beijing 100811, China
Danya Yao
Department of Automation, Tsinghua University, Beijing 100084, China
Huaxin Pei
Tsinghua University
Intelligence Testing, Multi-Agent Systems, Intelligent Vehicles, Cooperative Driving