🤖 AI Summary
This work addresses the lack of theoretical understanding of the uncertainty quantified by Random Network Distillation (RND) and its connections to deep ensembles and Bayesian inference. By analyzing RND within the Neural Tangent Kernel (NTK) framework in the infinite-width limit, we establish that the squared self-prediction error of RND is equivalent to the predictive variance of deep ensembles. Furthermore, by constructing a specific target function, we show that the RND error distribution can be made to match the Bayesian posterior predictive distribution. Building on these insights, we propose a novel Bayesian RND sampling algorithm that efficiently generates i.i.d. samples consistent with the exact Bayesian posterior, thereby achieving uncertainty quantification with both theoretical guarantees and computational efficiency.
📝 Abstract
Uncertainty quantification is central to the safe and efficient deployment of deep learning models, yet many computationally practical methods lack rigorous theoretical motivation. Random network distillation (RND) is a lightweight technique that measures novelty via prediction errors against a fixed random target. While empirically effective, it has remained unclear what uncertainties RND measures and how its estimates relate to other approaches, such as Bayesian inference or deep ensembles. This paper establishes these missing theoretical connections by analyzing RND within the neural tangent kernel framework in the limit of infinite network width. Our analysis reveals two central findings in this limit: (1) The uncertainty signal from RND -- its squared self-prediction error -- is equivalent to the predictive variance of a deep ensemble. (2) By constructing a specific RND target function, we show that the RND error distribution can be made to mirror the centered posterior predictive distribution of Bayesian inference with wide neural networks. Based on this equivalence, we further devise a posterior sampling algorithm that generates i.i.d. samples from an exact Bayesian posterior predictive distribution using this modified Bayesian RND model. Collectively, our findings provide a unified theoretical perspective that places RND within the principled frameworks of deep ensembles and Bayesian inference, and offer new avenues for efficient yet theoretically grounded uncertainty quantification methods.
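As a toy illustration of the RND mechanism the abstract describes (not the paper's construction), the uncertainty signal can be sketched with a fixed random target network and a simple predictor fitted only on previously seen data; the squared self-prediction error then grows on novel inputs. All dimensions, data scales, and the use of a linear least-squares predictor as a stand-in for a trained predictor network are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes, not taken from the paper.
d_in, d_hidden, d_out = 2, 64, 8
W1 = rng.normal(size=(d_in, d_hidden)) / np.sqrt(d_in)      # fixed random weights
W2 = rng.normal(size=(d_hidden, d_out)) / np.sqrt(d_hidden)

def target(x):
    """Fixed random target network; never trained."""
    return np.tanh(x @ W1) @ W2

# "Seen" inputs clustered near the origin; the predictor is fitted only here.
X_train = rng.normal(scale=0.5, size=(200, d_in))

# Stand-in predictor: a linear model fitted by least squares to the target's
# outputs on the seen data (RND proper trains a second neural network).
theta, *_ = np.linalg.lstsq(X_train, target(X_train), rcond=None)

def rnd_error(x):
    """Squared self-prediction error: RND's novelty/uncertainty signal."""
    return np.sum((x @ theta - target(x)) ** 2, axis=-1)

x_seen = rng.normal(scale=0.5, size=(100, d_in))   # in-distribution
x_novel = rng.normal(scale=5.0, size=(100, d_in))  # far from training data
print(rnd_error(x_seen).mean(), rnd_error(x_novel).mean())
```

Because the predictor is only fitted where data was observed, the mean error on `x_novel` is far larger than on `x_seen`, which is exactly the novelty signal RND exploits.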