Bayesian Inference of Contextual Bandit Policies via Empirical Likelihood

📅 2026-02-11

📈 Citations: 0

✨ Influential: 0

career value

194K/year

🤖 AI Summary

This work addresses the challenge of evaluating and comparing multiple contextual bandit policies under limited data by proposing a Bayesian inference framework based on empirical likelihood. For the first time in this domain, empirical likelihood is integrated to enable joint posterior inference over multiple policies without requiring strong parametric assumptions. This approach significantly enhances the robustness and flexibility of uncertainty quantification in small-sample settings. Experimental results on both Monte Carlo simulations and a real-world adolescent BMI dataset demonstrate that the proposed method accurately estimates policy values and provides reliable confidence measures for policy comparisons.

Technology Category

Application Category

📝 Abstract

Policy inference plays an essential role in the contextual bandit problem. In this paper, we use empirical likelihood to develop a Bayesian inference method for the joint analysis of multiple contextual bandit policies in finite sample regimes. The proposed inference method is robust to small sample sizes and is able to provide accurate uncertainty measurements for policy value evaluation. In addition, it allows for flexible inferences on policy comparison with full uncertainty quantification. We demonstrate the effectiveness of the proposed inference method using Monte Carlo simulations and its application to an adolescent body mass index data set.

Problem

Research questions and friction points this paper is trying to address.

contextual bandit

policy inference

uncertainty quantification

small sample

policy comparison

Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian inference

empirical likelihood

contextual bandits