Probabilistic Verification of Recurrent Neural Networks for Single and Multi-Agent Reinforcement Learning

📅 2026-05-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the verification of recurrent neural network (RNN) policies in partially observable reinforcement learning. Because such policies depend on latent hidden-state dynamics, existing verification tools tend to produce overly conservative or inconclusive results: they rely either on restrictive modeling assumptions or on coarse over-approximations of the hidden state space. The paper introduces RNN-ProVe, a probabilistic verification framework that estimates the likelihood of undesired behaviors in RNN-based policies. It uses policy-driven sampling to approximate the set of hidden states that are feasible under a trained policy, and derives statistical error bounds that yield bounded-error, high-confidence estimates of behavioral violations. Experiments on partially observable single-agent and cooperative multi-agent tasks indicate that RNN-ProVe gives more quantitative, feasibility-aware probabilistic guarantees than existing tools while scaling to recurrent and multi-agent settings.
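
To make "bounded-error, high-confidence" concrete, the block below works through one standard way such a guarantee can be stated. It is an illustration only: the summary does not say which concentration inequality the paper uses, so Hoeffding's inequality is an assumption here, as are the symbols $Z_i$, $\hat{p}$, $\varepsilon$, $\delta$, and $n$.

```latex
% Assumption: Z_1, ..., Z_n are i.i.d. indicators, Z_i = 1 if the i-th sampled
% rollout violates the property, and p = Pr[Z_i = 1] is the true violation probability.
\hat{p} = \frac{1}{n} \sum_{i=1}^{n} Z_i

% Hoeffding's inequality for variables bounded in [0, 1]:
\Pr\left[\, |\hat{p} - p| \ge \varepsilon \,\right] \le 2 \exp\left(-2 n \varepsilon^{2}\right)

% Requiring the right-hand side to be at most \delta gives a sufficient sample size:
n \ge \frac{\ln(2/\delta)}{2 \varepsilon^{2}}

% Worked example: \varepsilon = 0.01, \delta = 0.01
% n \ge \ln(200) / (2 \cdot 10^{-4}) \approx 5.2983 / 0.0002 \approx 26491.6,
% so n = 26492 rollouts suffice under these assumptions.
```

Under this reading, the error bound $\varepsilon$ and the confidence level $1 - \delta$ are chosen up front, and the number of samples follows from them.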

📝 Abstract

History-dependent policies induced by recurrent neural networks (RNNs) rely on latent hidden state dynamics, making verification in partially observable reinforcement learning (RL) challenging. Existing RNN verification tools typically rely on restrictive modeling assumptions or coarse over-approximations of the hidden state space, which can lead to overly conservative or inconclusive results. We propose $\textbf{RNN}$ $\textbf{Pro}$babilistic $\textbf{Ve}$rification ($\texttt{RNN-ProVe}$), a probabilistic framework that $\textit{estimates the likelihood}$ of undesired behaviors in RNN-based policies. $\texttt{RNN-ProVe}$ uses policy-driven sampling to approximate the set of hidden states that are feasible under a trained policy, and derives statistical error bounds to produce bounded-error, high-confidence estimates of behavioral violations. Experiments on partially observable single-agent and cooperative multi-agent tasks show that $\texttt{RNN-ProVe}$ yields more quantitative, feasibility-aware probabilistic guarantees than existing tools, while scaling to recurrent and multi-agent settings.
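
The sampling idea in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the RNN-ProVe algorithm: the toy one-dimensional environment, the random (untrained) tanh RNN, the safety property `|pos| <= limit`, every function name, and the use of a Hoeffding bound are all assumptions made for the example. What it does show is the general pattern: hidden states are only ever reached by running the policy in closed loop, each episode gives one independent violation indicator, and the empirical violation rate comes with an error radius that holds with probability at least `1 - delta`.

```python
import math
import random


def make_toy_policy(hidden=4, seed=1):
    """Random weights for a tiny tanh RNN (hypothetical stand-in for a trained policy)."""
    rng = random.Random(seed)
    W_h = [[rng.gauss(0.0, 0.3) for _ in range(hidden)] for _ in range(hidden)]
    w_x = [rng.gauss(0.0, 1.0) for _ in range(hidden)]
    w_out = [rng.gauss(0.0, 1.0) for _ in range(hidden)]
    return W_h, w_x, w_out


def run_episode(policy, rng, horizon=30, limit=2.0):
    """Closed-loop rollout; returns True if |pos| ever exceeds `limit`.

    The hidden state is never sampled from a box: it is whatever the
    policy reaches when driven by its own observations and actions.
    """
    W_h, w_x, w_out = policy
    size = len(w_x)
    h = [0.0] * size                       # assumed initial hidden state
    pos = rng.uniform(-0.5, 0.5)           # assumed initial-state distribution
    for _ in range(horizon):
        obs = pos + rng.gauss(0.0, 0.1)    # noisy, partial observation
        h = [
            math.tanh(sum(W_h[i][j] * h[j] for j in range(size)) + w_x[i] * obs)
            for i in range(size)
        ]
        action = math.tanh(sum(w * hi for w, hi in zip(w_out, h)))
        pos += 0.3 * action                # toy dynamics
        if abs(pos) > limit:
            return True
    return False


def estimate_violation_probability(policy, n_episodes=2000, delta=0.01, seed=0):
    """Empirical violation rate plus a two-sided Hoeffding error radius.

    Episodes are independent, so with probability at least 1 - delta the
    true violation probability lies within `eps` of `p_hat`.
    """
    rng = random.Random(seed)
    violations = sum(run_episode(policy, rng) for _ in range(n_episodes))
    p_hat = violations / n_episodes
    eps = math.sqrt(math.log(2.0 / delta) / (2.0 * n_episodes))
    return p_hat, eps
```

Calling `estimate_violation_probability(make_toy_policy())` returns the pair `(p_hat, eps)`. The indicator is taken per episode rather than per time step because states within one trajectory are correlated, which would invalidate the independence assumption behind the bound.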

Problem

Research questions and friction points this paper is trying to address.

recurrent neural networks

probabilistic verification

partially observable reinforcement learning

hidden state dynamics

behavioral guarantees

Innovation

Methods, ideas, or system contributions that make the work stand out.

probabilistic verification

recurrent neural networks

partially observable reinforcement learning