🤖 AI Summary
This study investigates how to reverse-engineer an "antedata" distribution from the posterior — one that precisely removes specific observed data without introducing or discarding any other information. Bayes' theorem is reinterpreted as an optimal rule for information deletion, revealing its symmetry and optimality in data-removal scenarios. Building on variational inference and information-theoretic principles, the authors develop a backward updating framework constrained by strict information conservation — no information is created or destroyed — and formally show that this optimal deletion procedure coincides with conventional Bayesian updating. The work establishes a theoretical foundation for reversible Bayesian inference and machine unlearning, offering principled insight into how probabilistic models can faithfully forget selected data while preserving overall coherence.
📝 Abstract
In this same journal, Arnold Zellner published a seminal paper on Bayes' theorem as an optimal information processing rule. This result led to the variational formulation of Bayes' theorem, which is the central idea in generalized variational inference. Almost 40 years later, we revisit these ideas, but from the perspective of information deletion. We investigate rules that update a posterior distribution into an antedata distribution when a portion of the data is removed. In this context, a rule that neither destroys nor creates information is called the optimal information deletion rule, and we prove that it coincides with the traditional use of Bayes' theorem.
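The idea that Bayesian updating is reversible can be made concrete in a conjugate model. The following is a minimal sketch, not taken from the paper: in a Beta-Bernoulli model, updating on data adds observed counts to the Beta parameters, so "deleting" that data — dividing its likelihood back out of the posterior — amounts to subtracting the same counts, exactly recovering the distribution that would have held without the removed data. The function names `update` and `delete` are illustrative choices, not the paper's notation.

```python
def update(alpha, beta, heads, tails):
    """Bayes update: Beta(alpha, beta) prior plus Bernoulli counts
    yields the posterior Beta(alpha + heads, beta + tails)."""
    return alpha + heads, beta + tails

def delete(alpha, beta, heads, tails):
    """Information deletion: divide the likelihood of the removed
    data out of the posterior, i.e. subtract its counts."""
    return alpha - heads, beta - tails

prior = (1.0, 1.0)                                 # Beta(1, 1) prior
posterior = update(*prior, heads=7, tails=3)       # observe 7 heads, 3 tails
antedata = delete(*posterior, heads=7, tails=3)    # forget all of it

assert antedata == prior  # deletion exactly reverses the update
```

Because the deletion rule is just Bayes' theorem applied backward, no information beyond the removed data is lost: deleting only part of the observations (say 2 heads and 1 tail) leaves a posterior identical to one computed from the remaining data alone.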