🤖 AI Summary
This work addresses the challenge of tightly coupling real-time Bayesian inference with intelligent data acquisition in applications such as autonomous scientific discovery and personalized medicine. It proposes ALINE (Amortized Active Learning and Inference Engine), an end-to-end framework that jointly handles amortized inference and active data acquisition. Methodologically: (1) a transformer architecture is trained via reinforcement learning, with a reward based on information gain that the model's own integrated inference component estimates; (2) the querying strategy can be directed towards specific subsets of model parameters or designated predictive tasks; and (3) probabilistic inference and experimental design are unified, so that each newly acquired data point immediately refines the model's predictions. Evaluated on regression-based active learning, classical Bayesian experimental design benchmarks, and a psychometric model with selectively targeted parameters, the approach delivers instant, accurate inference together with efficient selection of informative points.
📝 Abstract
Many critical applications, from autonomous scientific discovery to personalized medicine, demand systems that can both strategically acquire the most informative data and instantaneously perform inference based upon it. While amortized methods for Bayesian inference and experimental design offer part of the solution, neither approach is optimal in the most general and challenging task, where new data needs to be collected for instant inference. To tackle this issue, we introduce the Amortized Active Learning and Inference Engine (ALINE), a unified framework for amortized Bayesian inference and active data acquisition. ALINE leverages a transformer architecture trained via reinforcement learning with a reward based on self-estimated information gain provided by its own integrated inference component. This allows it to strategically query informative data points while simultaneously refining its predictions. Moreover, ALINE can selectively direct its querying strategy towards specific subsets of model parameters or designated predictive tasks, optimizing for posterior estimation, data prediction, or a mixture thereof. Empirical results on regression-based active learning, classical Bayesian experimental design benchmarks, and a psychometric model with selectively targeted parameters demonstrate that ALINE delivers both instant and accurate inference along with efficient selection of informative points.
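As a rough illustration of a reward "based on self-estimated information gain", the sketch below scores each query by how much it raises the log-density that an inference component assigns to the true simulated parameter. This is a minimal toy under stated assumptions, not ALINE's implementation: the log-density-difference form of the reward, the one-parameter linear model, and the names `ConjugatePosterior` and `run_episode` are all hypothetical, and an exact conjugate Gaussian posterior stands in for the learned transformer.

```python
import math
import random


def gaussian_logpdf(value, mean, var):
    """Log-density of a univariate Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (value - mean) ** 2 / var)


class ConjugatePosterior:
    """Exact posterior over theta for y = theta * x + noise, prior N(0, 1).

    Hypothetical stand-in for a learned amortized inference network.
    """

    def __init__(self, noise_std=0.5):
        self.noise_var = noise_std ** 2
        self.precision = 1.0      # prior precision
        self.weighted_sum = 0.0   # accumulates x * y / noise_var

    def update(self, x, y):
        self.precision += x * x / self.noise_var
        self.weighted_sum += x * y / self.noise_var

    def log_density(self, theta):
        var = 1.0 / self.precision
        mean = self.weighted_sum * var
        return gaussian_logpdf(theta, mean, var)


def run_episode(theta, designs, noise_std=0.5, seed=0):
    """Simulate queries at the given designs and return per-step rewards.

    Assumed reward: the gain in log-density of the true theta after each
    new observation (known here because the data are simulated).
    """
    rng = random.Random(seed)
    posterior = ConjugatePosterior(noise_std)
    rewards = []
    previous = posterior.log_density(theta)
    for x in designs:
        y = theta * x + rng.gauss(0.0, noise_std)  # simulated outcome
        posterior.update(x, y)
        current = posterior.log_density(theta)
        rewards.append(current - previous)
        previous = current
    return rewards, posterior


rewards, posterior = run_episode(theta=0.8, designs=[2.0, -1.5, 1.0])
print(rewards, posterior.precision)
```

Here the designs are fixed in advance. In a reinforcement learning setup, a policy would instead choose each `x` to maximize the cumulative reward, which under this assumed reward telescopes to the total log-density gain over the episode.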