A Unified Probabilistic Framework for Dictionary Learning with Parsimonious Activation

📅 2025-09-29
📈 Citations: 0
Influential citations: 0
🤖 AI Summary
Traditional dictionary learning imposes sparsity constraints on individual samples, neglecting how atoms are shared across the entire dataset, which leads to redundant dictionaries and poor generalization. To address this, we propose a unified probabilistic dictionary-learning framework that enforces global sparsity at the atom level via row-wise $L_\infty$-norm regularization and a Beta-Bernoulli prior. This formulation provides a Bayesian interpretation of sparsity control and establishes theoretical connections to minimum description length and pathlet learning. The model integrates $L_1$ regularization, variational inference, and Bayesian hyperparameter optimization, balancing interpretability and computational efficiency. Experiments demonstrate a 20% reduction in reconstruction error while activating only about 10% of the atoms required by conventional methods, achieving higher sparsity and superior reconstruction quality and validating both the empirical effectiveness and the theoretical consistency of the framework.
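As a rough illustration, the combined objective described above can be sketched as follows, where $X$ is the data matrix, $D$ the dictionary, $A$ the coefficient matrix, and $\lambda_1, \lambda_\infty$ the two regularization weights; the squared-error loss and the exact weighting are assumptions for illustration, not the paper's verbatim formulation:

$$\min_{D,A}\ \tfrac{1}{2}\|X - DA\|_F^2 \;+\; \lambda_1\|A\|_1 \;+\; \lambda_\infty \sum_{k=1}^{K} \|A_{k,:}\|_\infty$$

Because $\|A_{k,:}\|_\infty$ is zero only when the entire $k$-th row vanishes, this penalty deactivates whole atoms across the dataset rather than individual coefficients.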

📝 Abstract
Dictionary learning is traditionally formulated as an $L_1$-regularized signal reconstruction problem. While recent developments have incorporated discriminative, hierarchical, or generative structures, most approaches encourage representation sparsity only over individual samples and overlook how atoms are shared across samples, resulting in redundant and sub-optimal dictionaries. We introduce a parsimony-promoting regularizer based on the row-wise $L_\infty$ norm of the coefficient matrix. This additional penalty encourages entire rows of the coefficient matrix to vanish, thereby reducing the number of dictionary atoms activated across the dataset. We derive the formulation from a probabilistic model with Beta-Bernoulli priors, which provides a Bayesian interpretation linking the regularization parameters to prior distributions. We further derive a theoretical criterion for optimal hyperparameter selection and connect our formulation to Minimum Description Length, Bayesian model selection, and pathlet learning. Extensive experiments on benchmark datasets demonstrate that our method achieves substantially improved reconstruction quality (a 20% reduction in RMSE) and enhanced representation sparsity while using fewer than one-tenth of the available dictionary atoms, empirically validating our theoretical analysis.
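For intuition, one common Beta-Bernoulli construction for atom-level activation reads as follows; the exact parameterization in the paper may differ, so treat $\alpha$ and $\beta$ as placeholder hyperparameters:

$$\pi_k \sim \mathrm{Beta}(\alpha,\beta), \qquad z_k \mid \pi_k \sim \mathrm{Bernoulli}(\pi_k), \qquad k = 1,\dots,K$$

Here $z_k = 0$ forces the coefficient row $A_{k,:}$ to zero for every sample, so integrating out $\pi_k$ is what lets the prior's $(\alpha, \beta)$ be identified with the strength of the row-wise penalty.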
Problem

Research questions and friction points this paper is trying to address.

Promoting row-wise sparsity in dictionary learning
Reducing redundant atom activation across samples
Connecting probabilistic priors with regularization parameters
Innovation

Methods, ideas, or system contributions that make the work stand out.

Parsimony-promoting regularizer using the row-wise $L_\infty$ norm (see the sketch after this list)
Probabilistic model with Beta-Bernoulli priors for Bayesian interpretation
Optimal hyperparameter selection connecting MDL and Bayesian model selection
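A minimal NumPy sketch of how such an objective and its atom-level sparsity could be evaluated; all names and default weights (`lam_l1`, `lam_row`, `tol`) are illustrative assumptions, not the authors' implementation:

```python
# Illustrative sketch of the parsimony-promoting objective described above.
import numpy as np

def objective(X, D, A, lam_l1=0.1, lam_row=0.5):
    """Reconstruction loss + per-sample L1 sparsity + row-wise L-infinity penalty.

    X : (d, n) data matrix, D : (d, K) dictionary, A : (K, n) coefficients.
    The row-wise term sums max |A[k, :]| over atoms k, so an atom contributes
    nothing only when its entire coefficient row is zero.
    """
    recon = 0.5 * np.linalg.norm(X - D @ A, "fro") ** 2
    l1 = lam_l1 * np.abs(A).sum()
    row_linf = lam_row * np.abs(A).max(axis=1).sum()
    return recon + l1 + row_linf

def active_atoms(A, tol=1e-8):
    """Count atoms whose coefficient row is not (numerically) all zero."""
    return int((np.abs(A).max(axis=1) > tol).sum())
```

Monitoring `active_atoms(A)` during training is one simple way to observe the atom-level sparsity the row-wise penalty is meant to induce.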
Zihui Zhao
Tsinghua University, Institute of Data and Information, Shenzhen Key Laboratory of Ubiquitous Data Enabling
Yuanbo Tang
Tsinghua University, Institute of Data and Information, Shenzhen Key Laboratory of Ubiquitous Data Enabling
Jieyu Ren
University of Chinese Academy of Sciences, Kavli Institute for Theoretical Sciences
Xiaoping Zhang
China National Bamboo Research Center
Yang Li
Tsinghua University, Institute of Data and Information, Shenzhen Key Laboratory of Ubiquitous Data Enabling