🤖 AI Summary
In high-stakes domains (e.g., healthcare, aviation), explainable reinforcement learning (XRL) faces a fundamental trade-off: black-box models lack interpretability, while inherently interpretable models, such as decision trees, are costly to train. Method: This paper proposes Social Interpretable RL (SIRL), a new population-based XRL framework grounded in social learning mechanisms. It introduces a two-stage training paradigm: (i) collaborative voting, where all agents interact with a shared environment instance, each proposes an action, and a vote selects the action that is executed; and (ii) individual fine-tuning, wherein lightweight RL refines agent-specific policies, each in its own environment instance. Contribution/Results: By integrating social learning into XRL, the approach lowers the cost of training interpretable models through low-overhead collaborative training, preserving strict model interpretability while substantially improving sample and computational efficiency. Evaluated on six standard benchmarks, it achieves state-of-the-art performance among interpretable methods and accelerates training by 3.2× on average over leading interpretable RL methods.
📝 Abstract
Reinforcement Learning (RL) holds the promise of being an enabling technology for many applications. However, since most of the literature in the field currently focuses on opaque models, the use of RL in high-stakes scenarios, where interpretability is crucial, is still limited. Recently, some approaches to interpretable RL, e.g., based on Decision Trees, have been proposed, but one of the main limitations of these techniques is their training cost. To overcome this limitation, we propose a new population-based method, called Social Interpretable RL (SIRL), inspired by social learning principles, to improve learning efficiency. Our method mimics a social learning process, where each agent in a group learns to solve a given task based both on its own individual experience and on the experience acquired together with its peers. Our approach is divided into two phases. In the collaborative phase, all the agents in the population interact with a shared instance of the environment, where each agent observes the state and independently proposes an action. Then, voting is performed to choose the action that will actually be performed in the environment. In the individual phase, each agent refines its individual performance by interacting with its own instance of the environment. This mechanism lets the agents experience a larger number of episodes while simultaneously reducing the computational cost of the process. Our results on six well-known benchmarks show that SIRL reaches state-of-the-art performance compared with the alternative interpretable methods from the literature.
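The voting step of the collaborative phase can be illustrated with a minimal sketch. The abstract does not state which voting rule SIRL uses, so plurality voting with ties going to the earliest proposal is an assumption here. The names `majority_vote`, `collaborative_step`, and `make_stump` are hypothetical, and the one-split threshold policies merely stand in for interpretable agents such as decision trees.

```python
from collections import Counter
from typing import Callable, Hashable, List, Sequence, Tuple

# Hypothetical type: a policy maps an observation to a discrete action.
Policy = Callable[[Sequence[float]], Hashable]


def majority_vote(proposals: Sequence[Hashable]) -> Hashable:
    """Return the most frequently proposed action.

    Assumption: ties are broken in favor of the action proposed first.
    """
    if not proposals:
        raise ValueError("at least one proposal is required")
    counts = Counter(proposals)
    best = max(counts.values())
    for action in proposals:
        if counts[action] == best:
            return action


def collaborative_step(
    policies: Sequence[Policy], observation: Sequence[float]
) -> Tuple[Hashable, List[Hashable]]:
    """Every agent sees the same observation and proposes an action;
    the voted action is the one that would be sent to the shared environment."""
    proposals = [policy(observation) for policy in policies]
    return majority_vote(proposals), proposals


def make_stump(feature: int, threshold: float) -> Policy:
    """Hypothetical interpretable agent: a decision tree with a single split."""
    def policy(obs: Sequence[float]) -> int:
        return 1 if obs[feature] > threshold else 0
    return policy


# Three agents vote on one shared observation.
policies = [make_stump(0, 0.0), make_stump(0, 0.5), make_stump(1, -1.0)]
action, proposals = collaborative_step(policies, [0.2, -2.0])
print(proposals, action)  # [1, 0, 0] 0
```

In a full training loop, the voted action would be executed once in the shared environment and the resulting transition made available to every agent, so a single environment step yields experience for the whole population. The individual phase, where each agent interacts with its own environment instance, is not shown.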