🤖 AI Summary
This paper addresses the challenge of asymptotic optimality in pure exploration when the answer set is infinite—e.g., continuous function regression—where existing methods, asymptotically optimal only for finite answer sets, fail. To characterize the fundamental difficulty, we establish the first instance-dependent information-theoretic lower bound for infinite-answer settings. We propose the Sticky-Sequence Track-and-Stop framework, which unifies and generalizes prior Track-and-Stop approaches, enabling asymptotic optimality in infinite-answer regimes. Leveraging optimal stopping theory, adaptive sampling, and a novel sequential tracking mechanism, we provide a rigorous proof of its asymptotic optimality and identify several sufficient conditions under which optimality is preserved. Our work delivers the first theoretically complete pure-exploration paradigm for active learning in high-dimensional or continuous decision spaces.
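For context on the kind of bound being generalized: in the classical finite-answer setting, the known instance-dependent lower bound (Garivier & Kaufmann, 2016) for identifying the correct answer with confidence $1-\delta$ takes the following form (here as a hedged reminder of the standard result, not the paper's new infinite-answer bound):

```latex
\mathbb{E}_{\mu}[\tau_{\delta}] \;\ge\; T^{*}(\mu)\,\mathrm{kl}(\delta, 1-\delta),
\qquad
T^{*}(\mu)^{-1} \;=\; \sup_{w \in \Delta_K}\;
\inf_{\lambda \in \mathrm{Alt}(\mu)}\;
\sum_{a=1}^{K} w_a\,\mathrm{KL}(\mu_a, \lambda_a),
```

where $\tau_{\delta}$ is the stopping time, $\Delta_K$ is the simplex of sampling proportions over the $K$ arms, and $\mathrm{Alt}(\mu)$ is the set of instances whose correct answer differs from that of $\mu$. The paper's lower bound extends this characterization to settings where the answer set is infinite.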
📝 Abstract
We study pure exploration problems in which the set of correct answers may be infinite, as happens, e.g., when the goal is to estimate a continuous function of the means of the bandit. We derive an instance-dependent lower bound for these problems. By analyzing it, we explain why existing methods for finite-answer problems (i.e., Sticky Track-and-Stop) fail to be asymptotically optimal in this more general setting. Finally, we present a framework, Sticky-Sequence Track-and-Stop, which generalizes both Track-and-Stop and Sticky Track-and-Stop and enjoys asymptotic optimality. Owing to its generality, our analysis also highlights special cases in which existing methods retain optimality.
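For readers unfamiliar with the baseline being generalized, the classic Track-and-Stop template (Garivier & Kaufmann style) alternates a tracking rule, which steers empirical sampling proportions toward plug-in oracle weights, with a Chernoff-type stopping rule. Below is a minimal sketch for unit-variance Gaussian best-arm identification. It is illustrative only: the oracle-weight optimization is approximated by crude random search (the actual algorithm solves it exactly), and the stopping threshold is a common heuristic, not the paper's calibrated one.

```python
import math
import random

def glr_value(w, mu, best):
    # Transportation-cost rate for unit-variance Gaussian arms: the cost of
    # confusing `best` with its hardest challenger, under proportions w.
    vals = []
    for a in range(len(mu)):
        if a == best:
            continue
        gap = mu[best] - mu[a]
        vals.append(gap * gap / (2.0 * (1.0 / w[best] + 1.0 / w[a])))
    return min(vals)

def plugin_weights(mu, rng, iters=50):
    # Crude random search over the simplex for the oracle weights w*(mu).
    # (Track-and-Stop solves this optimization exactly; this is a stand-in.)
    k = len(mu)
    istar = max(range(k), key=lambda a: mu[a])
    best_w, best_v = None, -1.0
    for _ in range(iters):
        e = [rng.expovariate(1.0) for _ in range(k)]
        s = sum(e)
        w = [x / s for x in e]
        v = glr_value(w, mu, istar)
        if v > best_v:
            best_v, best_w = v, w
    return best_w

def track_and_stop(means, delta, rng, max_steps=20000):
    # Sketch of classic Track-and-Stop for Gaussian best-arm identification.
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k

    def pull(a):
        counts[a] += 1
        sums[a] += rng.gauss(means[a], 1.0)

    for a in range(k):  # initialization: one pull per arm
        pull(a)
    for t in range(k, max_steps):
        mu = [sums[a] / counts[a] for a in range(k)]
        best = max(range(k), key=lambda a: mu[a])
        # Chernoff stopping rule; log((1 + log t)/delta) is a heuristic threshold.
        w_emp = [counts[a] / t for a in range(k)]
        if t * glr_value(w_emp, mu, best) > math.log((1 + math.log(t)) / delta):
            return best, t
        # D-tracking: force exploration of under-sampled arms, otherwise
        # track the plug-in oracle weights.
        under = [a for a in range(k) if counts[a] < math.sqrt(t) - k / 2]
        if under:
            pull(min(under, key=lambda a: counts[a]))
        else:
            w = plugin_weights(mu, rng)
            pull(max(range(k), key=lambda a: t * w[a] - counts[a]))
    return max(range(k), key=lambda a: sums[a] / counts[a]), max_steps
```

The failure mode the paper targets does not appear in this sketch: with infinitely many candidate answers, the plug-in answer can oscillate forever, which is what the "sticky" (and here, sticky-sequence) modification addresses.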