Evolution of Information in Interactive Decision Making: A Case Study for Multi-Armed Bandits

📅 2025-03-01

🤖 AI Summary

This paper investigates how information evolves during interactive decision making in stochastic multi-armed bandits. For a fundamental instance in which a unique optimal arm outperforms the rest by a fixed margin, it characterizes the optimal success probability and the mutual information over time. The mutual information grows in three phases (linear, then quadratic, then linear again), a pattern that differs from the behavior seen in non-interactive settings. The paper also shows that optimal success probability and mutual information can be decoupled: achieving optimal learning does not necessarily require maximizing information gain.

📝 Abstract

We study the evolution of information in interactive decision making through the lens of a stochastic multi-armed bandit problem. Focusing on a fundamental example where a unique optimal arm outperforms the rest by a fixed margin, we characterize the optimal success probability and mutual information over time. Our findings reveal distinct growth phases in mutual information -- initially linear, transitioning to quadratic, and finally returning to linear -- highlighting curious behavioral differences between interactive and non-interactive environments. In particular, we show that optimal success probability and mutual information can be decoupled, where achieving optimal learning does not necessarily require maximizing information gain. These findings shed new light on the intricate interplay between information and learning in interactive decision making.
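To make the two tracked quantities concrete, here is a minimal Monte Carlo sketch of success probability and mutual information over a fixed horizon. It is not the paper's optimal strategy or analysis. It assumes a Bayesian instance that the abstract does not spell out: the best arm is drawn uniformly, has mean `gap`, all other arms have mean 0, and rewards are unit-variance Gaussian. The sampling rule is plain round-robin, which ignores past rewards and so serves only as a non-interactive baseline. The function name and all parameters are hypothetical.

```python
import math
import random


def simulate_round_robin(n_arms, gap, horizon, n_trials=2000, seed=0):
    """Estimate success probability and mutual information (in nats).

    Assumed model (an illustration, not taken from the paper): the best arm
    is uniform over n_arms and has mean `gap`; every other arm has mean 0;
    rewards are N(mean, 1). Arms are pulled in round-robin order.
    """
    rng = random.Random(seed)
    successes = 0
    entropy_sum = 0.0
    for _ in range(n_trials):
        best = rng.randrange(n_arms)
        sums = [0.0] * n_arms
        pulls = [0] * n_arms
        for t in range(horizon):
            arm = t % n_arms  # non-adaptive: the choice ignores past rewards
            mean = gap if arm == best else 0.0
            sums[arm] += rng.gauss(mean, 1.0)
            pulls[arm] += 1
        # Log-posterior of "arm i is best", up to a constant shared by all i.
        scores = [gap * sums[i] - pulls[i] * gap * gap / 2.0
                  for i in range(n_arms)]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]  # shifted for stability
        total = sum(weights)
        posterior = [w / total for w in weights]
        # Success means the maximum-a-posteriori guess equals the best arm.
        guess = max(range(n_arms), key=lambda i: posterior[i])
        successes += int(guess == best)
        entropy_sum += -sum(p * math.log(p) for p in posterior if p > 0.0)
    success_prob = successes / n_trials
    # I(best arm; history) = H(prior) - E[H(posterior given the history)].
    mutual_info = math.log(n_arms) - entropy_sum / n_trials
    return success_prob, mutual_info
```

Calling `simulate_round_robin(n_arms=4, gap=0.5, horizon=T)` for increasing `T` traces both curves for this baseline. Swapping the round-robin line for an adaptive rule would leave the posterior formula unchanged, because the policy's own choices cancel in the likelihood ratio, which allows a rough comparison between interactive and non-interactive sampling under the same assumed model.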

Problem

Research questions and friction points this paper is trying to address.

Evolution of information in interactive decision making.

Characterizing optimal success probability and mutual information.

Decoupling optimal learning and information gain in bandit problems.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Characterizes optimal success probability over time.

Identifies distinct mutual information growth phases.

Decouples optimal learning from information maximization.

🔎 Similar Papers

Multi-Player Approaches for Dueling Bandits