🤖 AI Summary
This work addresses the ambiguity and data scarcity inherent in reward function identification for two-player zero-sum games by proposing a unified inverse reward learning framework. Leveraging observed agent policies, the framework reconstructs the underlying reward functions in both entropy-regularized static matrix games and dynamic Markov games. The key innovation lies in establishing, for the first time, identifiability conditions for linear reward functions under the quantal response equilibrium, and in designing a general-purpose learning algorithm applicable to both static and dynamic settings. By integrating quantal response equilibrium, entropy regularization, and maximum likelihood estimation, the method achieves sample-efficient learning. Theoretical analysis confirms the algorithm's reliability and sample efficiency, while numerical experiments demonstrate its effectiveness in competitive decision-making environments.
📄 Abstract
Estimating the unknown reward functions driving agents' behaviors is of central interest in inverse reinforcement learning and game theory. To tackle this problem, we develop a unified framework for reward function recovery in two-player zero-sum matrix games and Markov games with entropy regularization, where we aim to reconstruct the underlying reward functions given observed players' strategies and actions. This task is challenging due to the inherent ambiguity of inverse problems, the non-uniqueness of feasible rewards, and limited observational data coverage. To address these challenges, we establish the reward function's identifiability using the quantal response equilibrium (QRE) under linear assumptions. Building upon this theoretical foundation, we propose a novel algorithm to learn reward functions from observed actions. Our algorithm works in both static and dynamic settings and is adaptable to incorporate different methods, such as Maximum Likelihood Estimation (MLE). We provide strong theoretical guarantees for the reliability and sample efficiency of our algorithm. Further, we conduct extensive numerical studies to demonstrate the practical effectiveness of the proposed framework, offering new insights into decision-making in competitive environments.
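For intuition about the forward model underlying this inverse problem, the QRE of an entropy-regularized zero-sum matrix game can be approximated by a softmax fixed-point iteration: each player's mixed strategy is a softmax best response to the opponent's at temperature τ. The sketch below is illustrative only; the payoff matrix `A`, temperature `tau`, and plain fixed-point iteration are our assumptions for demonstration, not the paper's reward-learning algorithm.

```python
import numpy as np

def softmax(v):
    """Numerically stable softmax."""
    z = np.exp(v - v.max())
    return z / z.sum()

def qre_fixed_point(A, tau=1.0, iters=500):
    """Approximate the quantal response equilibrium of a zero-sum matrix
    game with payoff matrix A (row player maximizes x^T A y, column
    player minimizes it) via softmax best-response iteration at
    temperature tau. Illustrative sketch, not the paper's method."""
    m, n = A.shape
    x = np.ones(m) / m  # row player's mixed strategy
    y = np.ones(n) / n  # column player's mixed strategy
    for _ in range(iters):
        x = softmax(A @ y / tau)      # softmax response to y
        y = softmax(-A.T @ x / tau)   # softmax response to x
    return x, y

# Matching pennies: the unique QRE is uniform play for both players.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
x, y = qre_fixed_point(A)
```

Given trajectories of actions drawn from such equilibrium strategies, the inverse task studied here is to recover the entries (or linear parameters) of `A`, e.g. by MLE over observed action frequencies.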