🤖 AI Summary
This work addresses the Information Bottleneck (IB) problem via neural network–based optimization, aiming to learn a maximally compressed yet task-relevant representation of the input with respect to a target variable. We propose a novel mapping-based IB formulation that reduces the original bivariate optimization to a univariate one, substantially lowering modeling complexity. Based on this formulation, we design a differentiable neural architecture trained end-to-end via gradient descent for data-driven IB estimation. We theoretically establish that the proposed neural estimator converges to the IB-optimal solution as the sample size tends to infinity. Experiments on synthetic data and MNIST demonstrate that our method outperforms existing neural IB estimators in representation compression, task relevance, and training stability, achieving both computational efficiency and robustness.
📝 Abstract
The information bottleneck (IB) method is a technique for extracting the information that one random variable carries about another, and it has found extensive applications in machine learning. In this paper, neural-network-based estimation of the IB problem solution is studied through the lens of a novel formulation of the IB problem. By exploiting the inherent structure of the IB functional and leveraging the mapping approach, the proposed formulation involves only a single variable to be optimized, and is thus readily amenable to data-driven estimation via neural networks. A theoretical analysis guarantees that the neural estimator asymptotically solves the IB problem, and numerical experiments on both synthetic data and the MNIST dataset demonstrate its effectiveness.
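For context, the classical IB objective that such formulations start from (due to Tishby, Pereira, and Bialek) can be written as a trade-off between compression and relevance; the paper's single-variable reformulation is not reproduced here, but the standard bivariate objective it departs from is:

```latex
\min_{p(t \mid x)} \; I(X;T) - \beta \, I(Y;T),
\qquad \text{subject to the Markov chain } Y \leftrightarrow X \leftrightarrow T,
```

where $X$ is the input, $Y$ the relevance (target) variable, $T$ the learned representation, and $\beta \ge 0$ controls how much relevant information $I(Y;T)$ is retained per unit of compression cost $I(X;T)$.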