🤖 AI Summary
This work addresses the failure of classical statistical inference under adaptive data collection schemes such as LinUCB, where data dependence invalidates standard i.i.d. assumptions. The authors propose a condition termed “directional stability,” which is strictly weaker than existing target-agnostic stability assumptions, and develop a semiparametric efficiency theory around it. Leveraging a martingale representation of the canonical gradient, analysis of its predictable quadratic variation, and a convolution theorem, they characterize the efficiency frontier under adaptive sampling. They show that one-step estimators satisfying directional stability attain this efficiency bound and, for the first time, obtain asymptotic normality and semiparametric efficiency guarantees for a regular scalar functional under LinUCB sampling, even in high-dimensional settings.
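To fix ideas, the objects named above can be sketched in standard semiparametric notation (this notation is generic textbook convention, not necessarily the paper's own): a one-step estimator debiases a plug-in with the estimated canonical gradient, and under adaptive sampling the gradient sum is a martingale whose predictable quadratic variation must stabilize for a CLT to hold.

```latex
% One-step (first-order debiased) estimator for a target \psi(P),
% with \hat D an estimate of the canonical gradient (efficient
% influence function) evaluated at observations O_1,\dots,O_n:
\hat\psi_{\mathrm{os}} \;=\; \psi(\hat P) \;+\; \frac{1}{n}\sum_{t=1}^{n} \hat D(O_t).

% Under adaptive collection, with \mathcal{F}_{t-1} the history up to
% time t-1, the centered gradient sum is a martingale:
M_n \;=\; \sum_{t=1}^{n} D_t(O_t),
\qquad \mathbb{E}\!\left[D_t(O_t)\,\middle|\,\mathcal{F}_{t-1}\right] = 0.

% A martingale CLT requires the predictable quadratic variation
% to stabilize; directional stability is the condition that
% guarantees this along the target's direction:
\frac{1}{n}\,\langle M \rangle_n
\;=\; \frac{1}{n}\sum_{t=1}^{n}
\mathbb{E}\!\left[D_t(O_t)^2\,\middle|\,\mathcal{F}_{t-1}\right]
\;\xrightarrow{\;p\;}\; \sigma^2 \;>\; 0.
```

The point of the paper's condition is that this convergence is required only for the specific direction $D$ determined by the target, rather than uniformly over all directions.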
📝 Abstract
We study inference on scalar-valued pathwise differentiable targets after adaptive data collection, such as data gathered by a bandit algorithm. We introduce a novel target-specific condition, directional stability, which is strictly weaker than previously imposed target-agnostic stability conditions. Under directional stability, we show that estimators that would have been efficient under i.i.d. sampling remain asymptotically normal and semiparametrically efficient when computed from adaptively collected trajectories. The canonical gradient has a martingale form, and directional stability guarantees stabilization of its predictable quadratic variation, enabling asymptotic normality even in high dimensions. We characterize efficiency via a convolution theorem for the adaptive-data setting and give a condition under which the one-step estimator attains the efficiency bound. We verify directional stability for LinUCB, yielding the first semiparametric efficiency guarantee for a regular scalar target under LinUCB sampling.
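For readers unfamiliar with the sampling scheme the paper analyzes, the following is a minimal sketch of the standard disjoint-arms LinUCB algorithm: each arm keeps a ridge-regression estimate of its reward parameter and is selected by an optimism bonus proportional to the estimate's uncertainty in the current context. The true parameters, noise scale, and exploration width `alpha` here are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, K, T = 5, 3, 500                      # context dim, arms, horizon (illustrative)
theta = rng.normal(size=(K, d))          # hypothetical true arm parameters
alpha = 1.0                              # exploration width (assumed)

# Per-arm ridge state: A_k = I + sum_t x_t x_t^T, b_k = sum_t r_t x_t
A = np.stack([np.eye(d) for _ in range(K)])
b = np.zeros((K, d))
chosen = np.zeros(T, dtype=int)

for t in range(T):
    x = rng.normal(size=d)               # observed context
    ucb = np.empty(K)
    for k in range(K):
        A_inv = np.linalg.inv(A[k])
        theta_hat = A_inv @ b[k]         # ridge estimate for arm k
        # UCB score: point estimate plus alpha * sqrt(x^T A^{-1} x)
        ucb[k] = x @ theta_hat + alpha * np.sqrt(x @ A_inv @ x)
    k = int(np.argmax(ucb))              # optimistic arm choice
    r = x @ theta[k] + rng.normal(scale=0.1)  # noisy linear reward
    A[k] += np.outer(x, x)               # update chosen arm's statistics
    b[k] += r * x
    chosen[t] = k
```

Because the arm chosen at time `t` depends on all earlier contexts and rewards through `A` and `b`, the resulting trajectory is not i.i.d., which is exactly the dependence that breaks classical inference and motivates the directional-stability analysis.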