BFTS: Thompson Sampling with Bayesian Additive Regression Trees

📅 2026-02-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of suboptimal personalized interventions in mobile health, which arises from complex nonlinear user behaviors. To this end, we propose the first contextual bandit algorithm based on Bayesian Additive Regression Trees (BART), directly embedding BART into a Thompson sampling framework to enable flexible nonparametric modeling of the reward function and accurate uncertainty quantification. Theoretical analysis establishes a Bayesian regret bound and demonstrates that a variant of the proposed method achieves minimax optimality in the frequentist sense. Empirical evaluations show that our approach attains state-of-the-art regret performance on standard tabular datasets with well-calibrated uncertainty estimates close to nominal levels. In the Drink Less micro-randomized trial, the method increases user engagement by over 30% compared to existing strategies.

📝 Abstract
Contextual bandits are a core technology for personalized mobile health interventions, where decision-making requires adapting to complex, non-linear user behaviors. While Thompson Sampling (TS) is a preferred strategy for these problems, its performance hinges on the quality of the underlying reward model. Standard linear models suffer from high bias, while neural network approaches are often brittle and difficult to tune in online settings. Conversely, tree ensembles dominate tabular data prediction but typically rely on heuristic uncertainty quantification, lacking a principled probabilistic basis for TS. We propose Bayesian Forest Thompson Sampling (BFTS), the first contextual bandit algorithm to integrate Bayesian Additive Regression Trees (BART), a fully probabilistic sum-of-trees model, directly into the exploration loop. We prove that BFTS is theoretically sound, deriving an information-theoretic Bayesian regret bound of $\tilde{O}(\sqrt{T})$. As a complementary result, we establish frequentist minimax optimality for a "feel-good" variant, confirming the structural suitability of BART priors for non-parametric bandits. Empirically, BFTS achieves state-of-the-art regret on tabular benchmarks with near-nominal uncertainty calibration. Furthermore, in an offline policy evaluation on the Drink Less micro-randomized trial, BFTS improves engagement rates by over 30% compared to the deployed policy, demonstrating its practical effectiveness for behavioral interventions.
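The decision loop the abstract describes can be sketched as follows. A real BFTS implementation would draw a sum-of-trees sample from a BART MCMC chain at each round; the sketch below substitutes a bootstrap ensemble of regression stumps as a crude posterior surrogate, so every function name and constant here is illustrative, not the paper's method.

```python
# Hedged sketch of a BART-style Thompson sampling loop. Assumption: the BART
# posterior draw is replaced by a stump fit to a bootstrap resample of each
# arm's (context, reward) history ("bootstrap Thompson sampling").
import random

def fit_stump(xs, ys):
    """Fit a one-split regression tree (stump) minimizing squared error."""
    best = (float("inf"), None)
    for split in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= split]
        right = [y for x, y in zip(xs, ys) if x > split]
        if not left or not right:
            continue
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if sse < best[0]:
            best = (sse, (split, ml, mr))
    if best[1] is None:                      # all contexts identical: predict the mean
        m = sum(ys) / len(ys)
        return (min(xs), m, m)
    return best[1]

def predict(stump, x):
    split, ml, mr = stump
    return ml if x <= split else mr

def posterior_draw(history, n_boot=25):
    """One 'posterior draw' of the reward model: a stump fit to a bootstrap
    resample of this arm's history (stand-in for a BART MCMC sample)."""
    sample = [random.choice(history) for _ in range(n_boot)]
    xs, ys = zip(*sample)
    return fit_stump(list(xs), list(ys))

def bfts_step(context, histories):
    """One Thompson-sampling decision: draw a model per arm, act greedily."""
    scores = []
    for arm, hist in enumerate(histories):
        if len(hist) < 2:                    # force initial exploration of each arm
            return arm
        scores.append(predict(posterior_draw(hist), context))
    return max(range(len(scores)), key=scores.__getitem__)

# Toy environment: arm 0 is best for x < 0.5, arm 1 otherwise (nonlinear shift).
random.seed(0)
def true_reward(arm, x):
    return (1.0 if (x < 0.5) == (arm == 0) else 0.0) + random.gauss(0, 0.1)

histories = [[], []]
correct = 0
for t in range(400):
    x = random.random()
    a = bfts_step(x, histories)
    histories[a].append((x, true_reward(a, x)))
    correct += (a == (0 if x < 0.5 else 1))
print(correct / 400)  # fraction of optimal pulls; should climb well above 0.5
```

The structure mirrors the abstract's point: exploration comes from the randomness of the posterior draw itself, not from an added exploration bonus, which is why the quality of the reward model's uncertainty matters.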
Problem

Research questions and friction points this paper is trying to address.

contextual bandits
Thompson Sampling
Bayesian Additive Regression Trees
uncertainty quantification
personalized mobile health
Innovation

Methods, ideas, or system contributions that make the work stand out.

Thompson Sampling
Bayesian Additive Regression Trees
Contextual Bandits
Bayesian Regret
Non-parametric Bandits