On Generating Explanations for Reinforcement Learning Policies: An Empirical Study

📅 2023-09-29

🏛️ IEEE Control Systems Letters

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the limited interpretability of reinforcement learning (RL) policies, this paper proposes an automated method for generating explanations expressed in Linear Temporal Logic (LTL). The method uses the divergence between the action distribution of the target policy and that of a policy optimized for a candidate LTL formula as the guiding signal for the formula search. This avoids overly general "catch-all" explanations and keeps each explanation specific to the policy being explained. Using Monte Carlo policy optimization and simulation in three domains (capture-the-flag, car parking, and robot navigation), the approach generates concise, human-readable LTL explanations. The authors report that this comparison provides more insight into the target policy than existing methods.

📝 Abstract

Explaining reinforcement learning policies is important for deploying them in real-world scenarios. We introduce a set of linear temporal logic formulae designed to provide such explanations, and an algorithm for searching through those formulae for the one that best explains a given policy. Our key idea is to compare action distributions from the target policy with those from policies optimized for candidate explanations. This comparison provides more insight into the target policy than existing methods and avoids inference of “catch-all” explanations. We demonstrate our method in a simulated game of capture-the-flag, a car-parking environment, and a robot navigation task.
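The comparison described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes discrete action distributions stored per state, uses KL divergence as the distance (the abstract does not name a specific measure), and replaces the search with a plain minimum over a fixed set of candidates. The state names, formula labels, and probabilities are all hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete action distributions given as lists."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def mean_divergence(target, candidate, states):
    """Average KL divergence between two policies over a set of states."""
    total = sum(kl_divergence(target[s], candidate[s]) for s in states)
    return total / len(states)


def best_explanation(target, candidates, states):
    """Return the label of the candidate policy closest to the target."""
    return min(
        candidates,
        key=lambda name: mean_divergence(target, candidates[name], states),
    )


# Hypothetical data: each policy maps a state to action probabilities.
states = ["s0", "s1"]
target = {"s0": [0.9, 0.1], "s1": [0.2, 0.8]}
candidates = {
    # Policy assumed to be optimized for "eventually reach the goal".
    "F goal": {"s0": [0.8, 0.2], "s1": [0.3, 0.7]},
    # Policy assumed to be optimized for "always avoid the obstacle".
    "G !obstacle": {"s0": [0.5, 0.5], "s1": [0.5, 0.5]},
}

best = best_explanation(target, candidates, states)
print(best)
```

A candidate whose optimized policy behaves like the target in every sampled state scores near zero, while a formula that is satisfied by almost any behaviour tends to induce a policy that differs from the target, which is one way to read the abstract's point about avoiding "catch-all" explanations.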

Problem

Research questions and friction points this paper is trying to address.

Generating explanations for reinforcement learning policies

Using linear temporal logic to explain policy objectives

Validating the approach in capture-the-flag, car-parking, and robot navigation simulations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses linear temporal logic for explanations

Searches for the formula that best explains a given policy

Tests in capture-the-flag, car-parking, and robot navigation environments
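To make the role of linear temporal logic concrete, the sketch below checks three common temporal operators against a recorded trajectory. It is an illustration under stated assumptions: it uses a finite-trace reading of the operators (standard LTL is defined over infinite traces), it only handles atomic propositions as operands, and the proposition names `safe` and `flag` are hypothetical rather than taken from the paper.

```python
def eventually(trace, prop):
    """F prop: the proposition holds at some step of the trace."""
    return any(prop in step for step in trace)


def always(trace, prop):
    """G prop: the proposition holds at every step of the trace."""
    return all(prop in step for step in trace)


def until(trace, left, right):
    """left U right: left holds at every step before right first holds."""
    for step in trace:
        if right in step:
            return True
        if left not in step:
            return False
    return False  # right never held within the finite trace


# Hypothetical trajectory: each step is the set of propositions true there.
trace = [{"safe"}, {"safe"}, {"safe", "flag"}]

reaches_flag = eventually(trace, "flag")
stays_safe = always(trace, "safe")
safe_until_flag = until(trace, "safe", "flag")
print(reaches_flag, stays_safe, safe_until_flag)
```

A formula such as "stay safe until the flag is reached" is the kind of compact, checkable statement that a temporal-logic explanation of a policy can take.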
