AutoODD: Agentic Audits via Bayesian Red Teaming in Black-Box Models

📅 2025-09-10
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Defining the Operational Design Domain (ODD) for black-box machine learning models in high-risk applications remains challenging due to reliance on domain expertise and prohibitively high manual auditing costs. Method: This paper proposes an LLM-Agent–based intelligent auditing framework that integrates Bayesian red-teaming with uncertainty-aware failure distribution modeling. It enables efficient exploration of high-dimensional input spaces via low-dimensional projection onto text embedding manifolds. The LLM-Agent acts as a coordinator, orchestrating test case generation, model response collection, and uncertainty estimation. Results: Evaluated on MNIST-based digit omission simulation and real-world visual drone intrusion detection, the framework automatically identifies critical failure regions, significantly reducing dependence on expert knowledge while improving ODD delineation efficiency and regulatory compliance assurance.
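The uncertainty-aware failure modeling described above can be sketched as a Bayesian active-search loop: fit a surrogate over the embedding space, then probe where failure is plausible but still uncertain. The sketch below uses a Gaussian-process regressor from scikit-learn over a toy 2-D embedding space; `mut_fails`, the embedding dimensions, and the acquisition rule are all illustrative assumptions, not the paper's actual components.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)

def mut_fails(x):
    # Hypothetical model-under-test (MUT): fails inside one region of the
    # 2-D embedding manifold. Returns 1.0 on failure, 0.0 on pass.
    return float(np.linalg.norm(x - np.array([0.8, 0.8])) < 0.4)

# Candidate test cases, assumed already projected onto a low-dimensional
# text-embedding manifold (random 2-D points stand in for real embeddings).
candidates = rng.uniform(0.0, 1.0, size=(300, 2))

# Seed the failure landscape with a few random probes.
seed = rng.choice(len(candidates), size=8, replace=False)
X = candidates[seed]
y = np.array([mut_fails(x) for x in X])

for _ in range(20):
    # Surrogate over the failure indicator; its predictive std is the
    # uncertainty estimate that guides the search.
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2), alpha=1e-3)
    gp.fit(X, y)
    mean, std = gp.predict(candidates, return_std=True)
    # Acquisition: upper confidence bound on the failure indicator,
    # i.e. probe where predicted failure is high or uncertain.
    next_i = int(np.argmax(mean + std))
    X = np.vstack([X, candidates[next_i]])
    y = np.append(y, mut_fails(candidates[next_i]))

# Crude delineation of the estimated failure region (an ODD boundary).
likely_failures = candidates[gp.predict(candidates) > 0.5]
print(f"{len(likely_failures)} of {len(candidates)} candidates flagged")
```

In this framing, the probe budget concentrates around the failure region rather than being spread uniformly over the input space, which is the efficiency claim the summary makes.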

πŸ“ Abstract
Specialized machine learning models, regardless of architecture and training, are susceptible to failures in deployment. With their increasing use in high-risk situations, the ability to audit these models by determining their operational design domain (ODD) is crucial for ensuring safety and compliance. However, given the high-dimensional input spaces involved, this process often requires significant human resources and domain expertise. To alleviate this, we introduce AutoODD, an LLM-Agent-centric framework for the automated generation of semantically relevant test cases to search for failure modes in specialized black-box models. By leveraging LLM-Agents as tool orchestrators, we fit an uncertainty-aware failure distribution model on a learned text-embedding manifold, projecting the high-dimensional input space onto a low-dimensional text-embedding latent space. The LLM-Agent iteratively builds the failure landscape, using tools to generate test cases that probe the model-under-test (MUT) and to record its responses. The agent also guides the search with tools that query uncertainty estimates on the low-dimensional manifold. We demonstrate this process in a simple setting, using models trained on the MNIST dataset with digits withheld, and in the real-world setting of vision-based intruder detection for aerial vehicles.
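As a rough illustration of the tool-orchestration pattern the abstract describes, the sketch below wires stub tools for test-case generation and MUT probing into a simple agent loop. The tool names (`generate_test_case`, `probe_mut`) and the "fails at night" behavior are hypothetical stand-ins, not the paper's actual interfaces.

```python
def generate_test_case(prompt):
    # A real implementation would have an LLM turn `prompt` into a
    # concrete input for the MUT (e.g., a rendered or retrieved image).
    return {"prompt": prompt}

def probe_mut(test_case):
    # Query the black-box model-under-test. Stubbed so that the assumed
    # weakness is low-light scenes: any "night" prompt fails.
    return "night" in test_case["prompt"]

# Agent loop: generate a test case, probe the MUT, record the response.
failure_log = []
for prompt in ["drone at noon", "drone at night", "drone in fog"]:
    case = generate_test_case(prompt)
    failure_log.append((prompt, probe_mut(case)))

print(failure_log)
```

Keeping generation, probing, and bookkeeping behind separate tool boundaries is what lets the agent treat the MUT as a pure black box: only the probe tool ever touches the model.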
Problem

Research questions and friction points this paper is trying to address.

Automated audit of black-box models for failure modes
Reducing human effort in high-dimensional input testing
Generating semantically relevant test cases via LLM-Agents
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-Agent framework for automated test generation
Bayesian red teaming in black-box models
Uncertainty-aware failure distribution modeling