Towards Explaining Deep Neural Network Compression Through a Probabilistic Latent Space

📅 2024-02-29

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Deep neural network (DNN) compression lacks a rigorous theoretical foundation. Method: This paper proposes a theoretical framework grounded in a probabilistic latent space of DNN weights. It introduces, for the first time, the notions of analogous projected patterns (AP2) and analogous-in-probability projected patterns (AP3), and establishes a link between the AP2/AP3 property of a network's layers and compression performance. Using information-theoretic divergence measures (specifically KL divergence), it models sparsity, derives the existence and bounds of optimal sparsity, and explains how fine-tuning converges after pruning. The approach combines probabilistic modeling, latent-space analysis, and empirical validation. Results: Experiments with pre-trained AlexNet, ResNet-50, and VGG-16 on CIFAR-10 and CIFAR-100 show strong correlations (r > 0.92) between AP2/AP3 metrics and both sparsity levels and fine-tuning accuracy; theoretical predictions align closely with empirical outcomes. The core contribution is the first interpretable, quantifiable, and empirically verifiable foundation for DNN compression that combines probabilistic and information-theoretic analysis.
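To make the role of KL divergence more concrete, here is a minimal Python sketch. It is not the paper's implementation, and it rests on assumptions that do not appear in this summary: each weight tensor is treated as the mean of an isotropic Gaussian in the latent space, pruning is done by plain magnitude thresholding, and the helper names `magnitude_prune` and `gaussian_kl` are hypothetical. Under these assumptions, with unit variance on both sides, the KL divergence reduces to half the squared Euclidean distance between the original and pruned weights, so it grows as more weights are removed.

```python
import numpy as np


def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude.

    Illustrative pruning rule only; the paper may use a different criterion.
    """
    flat = np.abs(weights).ravel()
    k = int(round(sparsity * flat.size))  # number of weights to remove
    if k == 0:
        return weights.copy()
    drop = np.argsort(flat, kind="stable")[:k]  # indices of the k smallest
    mask = np.ones(flat.size, dtype=bool)
    mask[drop] = False
    return weights * mask.reshape(weights.shape)


def gaussian_kl(mu_p, mu_q, var_p=1.0, var_q=1.0):
    """KL( N(mu_p, var_p * I) || N(mu_q, var_q * I) ) for isotropic Gaussians.

    Assumption: weights act as Gaussian means in the latent space.
    """
    mu_p = np.asarray(mu_p, dtype=float).ravel()
    mu_q = np.asarray(mu_q, dtype=float).ravel()
    d = mu_p.size
    sq_dist = np.sum((mu_p - mu_q) ** 2)
    return 0.5 * (d * var_p / var_q + sq_dist / var_q - d
                  + d * np.log(var_q / var_p))


# Toy "layer": random weights stand in for a pre-trained layer.
rng = np.random.default_rng(0)
layer = rng.normal(size=(64, 64))

sparsities = [0.1, 0.5, 0.9]
kls = [gaussian_kl(layer, magnitude_prune(layer, s)) for s in sparsities]
for s, kl in zip(sparsities, kls):
    print(f"sparsity={s:.1f}  KL={kl:.3f}")
```

A layer whose divergence stays below a chosen threshold at a given sparsity could loosely be read as keeping an AP3-like property in this toy setting; the threshold and the Gaussian model are illustrative choices, not values taken from the paper.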

📝 Abstract

Despite the impressive performance of deep neural networks (DNNs), their computational complexity and storage requirements have motivated network compression. While DNN compression techniques such as pruning and low-rank decomposition have been extensively studied, their theoretical explanation has received insufficient attention. In this paper, we propose a novel theoretical framework that leverages a probabilistic latent space of DNN weights and explains optimal network sparsity using information-theoretic divergence measures. We introduce the notions of analogous projected patterns (AP2) and analogous-in-probability projected patterns (AP3) for DNNs and prove that there is a relationship between the AP3/AP2 property of the layers in a network and its performance. Further, we provide a theoretical analysis that explains the training process of the compressed network. The theoretical results are empirically validated through experiments on standard pre-trained benchmarks, including AlexNet, ResNet50, and VGG16, using the CIFAR10 and CIFAR100 datasets. Through our experiments, we highlight how the AP3 and AP2 properties relate to the fine-tuning of pruned DNNs and to sparsity levels.

Problem

Research questions and friction points this paper is trying to address.

Explaining DNN compression via probabilistic latent space

Linking layer properties (AP3/AP2) to network performance

Theoretical analysis of compressed network training process

Innovation

Methods, ideas, or system contributions that make the work stand out.

Probabilistic latent space explains DNN compression

AP2 and AP3 notions relate to network performance

Information-theoretic divergence measures explain optimal network sparsity

🔎 Similar Papers

MCNC: Manifold-Constrained Reparameterization for Neural Compression