Discovering and Learning Probabilistic Models of Black-Box AI Capabilities

📅 2025-12-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Evaluating and verifying the planning capabilities of black-box AI systems, such as large language models, in sequential decision making remains challenging because their internal mechanisms are opaque. Method: the paper proposes an interpretable, probabilistically grounded framework for symbolic capability modeling. PDDL-style formal models explicitly represent a black-box agent's executable actions, their preconditions, and the probability distributions over their outcomes. Integrated with Monte Carlo Tree Search, the framework automatically generates test tasks and iteratively prunes the hypothesis space of candidate models in a sound, efficient, and convergent manner; uncertainty quantification and semantic interpretability are achieved jointly through query-driven learning against the black box. Results: experiments across multiple black-box AI systems show significant improvements in modeling accuracy and efficiency, and the approach supports verifiability and debuggability, both critical requirements for safe deployment.

📝 Abstract
Black-box AI (BBAI) systems such as foundational models are increasingly being used for sequential decision making. To ensure that such systems are safe to operate and deploy, it is imperative to develop efficient methods that can provide a sound and interpretable representation of the BBAI's capabilities. This paper shows that PDDL-style representations can be used to efficiently learn and model an input BBAI's planning capabilities. It uses the Monte-Carlo tree search paradigm to systematically create test tasks, acquire data, and prune the hypothesis space of possible symbolic models. Learned models describe a BBAI's capabilities, the conditions under which they can be executed, and the possible outcomes of executing them along with their associated probabilities. Theoretical results show soundness, completeness and convergence of the learned models. Empirical results with multiple BBAI systems illustrate the scope, efficiency, and accuracy of the presented methods.
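The abstract describes learned models that pair each capability with the conditions under which it can be executed and a distribution over its possible outcomes. A minimal Python sketch of such a representation follows; the action name, literals, and probabilities are invented for illustration and are not taken from the paper:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ProbabilisticAction:
    """One learned capability: preconditions plus outcome distribution."""
    name: str
    preconditions: frozenset   # literals that must hold in the state
    outcomes: tuple            # ((probability, effect_literals), ...)

    def applicable(self, state: frozenset) -> bool:
        # Executable exactly when all preconditions hold.
        return self.preconditions <= state

# Illustrative capability: picking up block "a" succeeds 90% of the time.
pickup = ProbabilisticAction(
    name="pickup",
    preconditions=frozenset({"hand-empty", "on-table(a)"}),
    outcomes=(
        (0.9, frozenset({"holding(a)"})),  # success
        (0.1, frozenset()),                # slip: state unchanged
    ),
)

state = frozenset({"hand-empty", "on-table(a)"})
print(pickup.applicable(state))  # → True
```

This mirrors the structure of a probabilistic PDDL-style action schema: a symbolic precondition formula and a categorical distribution over effect sets whose probabilities sum to one.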
Problem

Research questions and friction points this paper is trying to address.

Learn probabilistic models of black-box AI planning capabilities
Use PDDL-style representations and Monte-Carlo tree search
Ensure safety and interpretability of AI decision-making systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

PDDL-style representations model black-box AI planning capabilities
Monte-Carlo tree search systematically tests and prunes symbolic hypotheses
Learned probabilistic models describe capabilities, conditions, and outcomes
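The test-and-prune idea above can be illustrated with a toy loop: each candidate precondition set is a hypothesis, and a hypothesis is discarded whenever a query to the black-box agent contradicts its prediction. The agent stub, literal names, and exhaustive state enumeration below are illustrative stand-ins, not the paper's MCTS-guided procedure:

```python
import itertools

LITERALS = ["clear(a)", "hand-empty", "on-table(a)"]

def black_box_can_execute(state):
    # Stand-in for querying the real BBAI system; here the hidden
    # "pickup" capability requires exactly these two literals.
    return {"hand-empty", "on-table(a)"} <= state

# Hypothesis space: every subset of literals is a candidate precondition set.
hypotheses = [frozenset(c)
              for r in range(len(LITERALS) + 1)
              for c in itertools.combinations(LITERALS, r)]

# Query the agent on every state; prune hypotheses whose prediction
# (executable iff preconditions hold) disagrees with the observation.
for bits in itertools.product([0, 1], repeat=len(LITERALS)):
    state = frozenset(l for l, b in zip(LITERALS, bits) if b)
    executed = black_box_can_execute(state)
    hypotheses = [h for h in hypotheses if (h <= state) == executed]

print(sorted(map(sorted, hypotheses)))  # → [['hand-empty', 'on-table(a)']]
```

In this toy setting the loop converges to the single consistent hypothesis; the paper's contribution is doing this soundly and efficiently with MCTS-generated test tasks rather than exhaustive enumeration, and extending it to stochastic outcome distributions.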