Dynamic Decision-Making under Model Misspecification

📅 2025-05-20
🤖 AI Summary
This paper studies dynamic decision-making over a finite parameter space when the functional form of the conditional expected reward is misspecified. Under misspecification, conventional algorithms such as Thompson Sampling guarantee neither exponential posterior concentration nor an $O(T^{-1})$ rate of average regret. The paper introduces a "pseudo-truth set", extending the pseudo-true parameter of White (1982), and shows that under mild conditions the parameter posterior still converges to this set at an exponential rate. It also characterizes necessary conditions for the expected posterior to converge within the set. The analysis builds on Bayesian updating within the Thompson Sampling framework, and the simulations track the maximum a posteriori (MAP) estimate. In those simulations the MAP estimate fails to converge under misspecification, yet average regret remains relatively robust compared with the correctly specified case.
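For orientation, one way to write down a pseudo-truth set is sketched below. This is an assumption based on the usual Kullback-Leibler reading of White (1982), not the paper's own definition, which may differ (for example because the action frequencies are chosen by the algorithm rather than fixed). The symbols $\Theta$, $p^{*}$, $p_{\theta}$ and $\pi$ are introduced here for illustration only.

```latex
% Assumed notation (not from the source):
%   \Theta                 finite parameter space
%   p^{*}(\cdot \mid a)    true reward distribution under action a
%   p_{\theta}(\cdot \mid a) reward distribution the model assigns under \theta
%   \pi(a)                 a fixed, hypothetical frequency of action a
\[
  \Theta^{*}(\pi)
  \;=\;
  \operatorname*{arg\,min}_{\theta \in \Theta}
  \sum_{a} \pi(a)\,
  \mathrm{KL}\!\left( p^{*}(\cdot \mid a) \,\middle\|\, p_{\theta}(\cdot \mid a) \right)
\]
% When the minimizer is unique, \Theta^{*}(\pi) reduces to a single
% pseudo-true parameter; when several parameters tie, it is a set.
```

Under this reading, a correctly specified model has zero divergence at the true parameter, while a misspecified model can leave several parameters equally close to the truth, which is why a set rather than a point is the natural limit object.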

📝 Abstract
In this study, I investigate the dynamic decision problem with a finite parameter space when the functional form of conditional expected rewards is misspecified. Traditional algorithms, such as Thompson Sampling, guarantee neither an $O(e^{-T})$ rate of posterior parameter concentration nor an $O(T^{-1})$ rate of average regret. However, under mild conditions, one can still achieve an exponential rate of convergence of the parameter to a pseudo-truth set, an extension of the pseudo-truth parameter concept introduced by White (1982). I further characterize the necessary conditions for the convergence of the expected posterior within this pseudo-truth set. Simulations demonstrate that while the maximum a posteriori (MAP) estimate of the parameters fails to converge under misspecification, the algorithm's average regret remains relatively robust compared to the correctly specified case. These findings suggest opportunities to design simple yet robust algorithms that achieve desirable outcomes even in the presence of model misspecification.
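The setting in the abstract can be illustrated with a minimal sketch of Thompson Sampling over a finite parameter grid. This is not the paper's implementation: it assumes Gaussian reward noise with known standard deviation, a uniform prior, and a model given as a table of predicted mean rewards. The function name `thompson_finite` and the arguments `true_means` and `model_means` are hypothetical.

```python
import numpy as np


def thompson_finite(true_means, model_means, horizon=2000, noise_sd=1.0, seed=0):
    """Thompson Sampling on a finite parameter grid (illustrative sketch).

    true_means:  shape (n_arms,), actual expected reward of each arm.
    model_means: shape (n_params, n_arms), expected reward the model assigns
                 to each arm under each candidate parameter. The model is
                 misspecified when no row equals true_means.
    Returns the final posterior over parameters and the average regret.
    """
    rng = np.random.default_rng(seed)
    true_means = np.asarray(true_means, dtype=float)
    model_means = np.asarray(model_means, dtype=float)
    n_params = model_means.shape[0]

    # Uniform prior, stored in log space to avoid underflow.
    log_post = np.full(n_params, -np.log(n_params))
    best_mean = true_means.max()
    total_regret = 0.0

    for _ in range(horizon):
        # Normalize the posterior and sample one candidate parameter.
        post = np.exp(log_post - log_post.max())
        post /= post.sum()
        k = rng.choice(n_params, p=post)

        # Act greedily with respect to the sampled parameter.
        arm = int(np.argmax(model_means[k]))

        # Rewards come from the true means, not from the model.
        reward = true_means[arm] + noise_sd * rng.standard_normal()

        # Bayes update with a Gaussian likelihood (assumed noise model).
        log_post += -0.5 * ((reward - model_means[:, arm]) / noise_sd) ** 2
        log_post -= log_post.max()

        total_regret += best_mean - true_means[arm]

    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    return post, total_regret / horizon
```

To compare the two regimes, call it once with a `model_means` table that contains `true_means` as a row (correctly specified) and once with a table that does not. For instance, with `true_means = [0.5, 0.2]` and the linear table `[[-1, -2], [0, 0], [1, 2]]`, the second and third rows predict means that are equally far from the truth on the first arm, so under the Gaussian assumption they tie in Kullback-Leibler divergence there. The posterior's argmax gives the MAP estimate discussed in the abstract.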

Problem

Research questions and friction points this paper is trying to address.

Dynamic decision-making with misspecified reward models
Achieving exponential posterior convergence to a pseudo-truth set
Robust regret performance under model misspecification

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic decision-making with finite parameter space
Exponential convergence to a pseudo-truth set
Robust algorithm under model misspecification