🤖 AI Summary
This work addresses the high computational cost and sensitivity to input noise inherent in traditional capsule networks due to iterative dynamic routing. The authors introduce, for the first time, the information bottleneck principle into capsule networks and propose a one-shot variational aggregation mechanism that eliminates iterative routing altogether. By leveraging global context compression and class-specific variational autoencoders, the method directly infers latent capsules in a single pass. This approach substantially enhances both efficiency and robustness: on benchmarks such as MNIST, it achieves an average accuracy improvement of over 14% under noisy conditions, accelerates training by 2.54×, increases inference throughput by 3.64×, reduces parameter count by 4.66%, and maintains high accuracy on clean data.
📝 Abstract
Capsule networks (CapsNets) excel at modeling hierarchical spatial relationships but suffer from two critical limitations: high computational cost due to iterative dynamic routing and poor robustness under input corruptions. To address these issues, we propose IBCapsNet, a novel capsule architecture grounded in the Information Bottleneck (IB) principle. Instead of iterative routing, IBCapsNet employs a one-pass variational aggregation mechanism: primary capsules are first compressed into a global context representation and then processed by class-specific variational autoencoders (VAEs) to infer latent capsules regularized by the KL divergence. This design enables efficient inference while inherently filtering out noise. Experiments on MNIST, Fashion-MNIST, SVHN, and CIFAR-10 show that IBCapsNet matches CapsNet in clean-data accuracy (achieving 99.41% on MNIST and 92.01% on SVHN), yet significantly outperforms it under four types of synthetic noise, with average improvements of +17.10% and +14.54% for clamped additive and multiplicative noise, respectively. Moreover, IBCapsNet achieves 2.54x faster training and 3.64x higher inference throughput compared to CapsNet, while reducing model parameters by 4.66%. Our work bridges information-theoretic representation learning with capsule networks, offering a principled path toward robust, efficient, and interpretable deep models. Code is available at https://github.com/cxiang26/IBCapsnet
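The aggregation described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name, the mean-pooling choice for global context compression, and the per-class affine heads are all illustrative assumptions; the paper's actual architecture may differ in each of these details. The sketch shows the one-pass idea: compress the primary capsules, predict Gaussian parameters per class, draw a single reparameterized sample, and compute the KL regularizer, with no routing iterations.

```python
import numpy as np

rng = np.random.default_rng(0)

def one_pass_variational_aggregation(primary_caps, W_mu, W_logvar, b_mu, b_logvar):
    """Hypothetical sketch of one-pass variational aggregation.

    primary_caps: (N, d_in) array of primary capsule vectors.
    W_mu, W_logvar: (num_classes, d_out, d_in) class-specific head weights.
    b_mu, b_logvar: (num_classes, d_out) class-specific head biases.
    Returns (latent, kl): latent capsules (num_classes, d_out) and
    per-class KL divergence to a standard normal prior.
    """
    # Global context compression (assumed here to be mean pooling).
    context = primary_caps.mean(axis=0)                      # (d_in,)
    # Class-specific heads predict Gaussian parameters for each latent capsule.
    mu = W_mu @ context + b_mu                               # (num_classes, d_out)
    logvar = W_logvar @ context + b_logvar                   # (num_classes, d_out)
    # Reparameterization trick: a single sample replaces iterative routing.
    eps = rng.standard_normal(mu.shape)
    latent = mu + np.exp(0.5 * logvar) * eps
    # KL( N(mu, sigma^2) || N(0, I) ), summed over capsule dimensions.
    kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=1)
    return latent, kl

# Toy usage with made-up dimensions.
num_classes, d_out, d_in, N = 10, 16, 8, 32
primary = rng.standard_normal((N, d_in))
W_mu = 0.1 * rng.standard_normal((num_classes, d_out, d_in))
W_logvar = 0.1 * rng.standard_normal((num_classes, d_out, d_in))
b_mu = np.zeros((num_classes, d_out))
b_logvar = np.zeros((num_classes, d_out))

latent, kl = one_pass_variational_aggregation(primary, W_mu, W_logvar, b_mu, b_logvar)
```

Because the forward pass is a fixed sequence of pooling, affine maps, and one sampling step, its cost is independent of any routing-iteration count, which is the source of the speedups the abstract reports.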