LIT-LVM: Structured Regularization for Interaction Terms in Linear Predictors using Latent Variable Models

📅 2025-06-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address overfitting when estimating interaction coefficients in high-dimensional, sparse settings, this paper proposes a structured regularization framework: each pairwise interaction coefficient is explicitly parameterized as the inner product of low-dimensional latent vectors associated with the two interacting features, which imposes an approximate low-rank structure on the matrix of interaction coefficients; the model is fit by gradient-based maximum likelihood estimation. This latent-variable parameterization keeps the predictor interpretable while improving generalization. Experiments on simulated and real-world datasets, particularly where the number of interaction terms far exceeds the sample size, show higher predictive accuracy than elastic net and factorization machines. The learned latent vectors also provide low-dimensional feature representations that support visualization and post-hoc analysis of interaction structure.

📝 Abstract
Some of the simplest, yet most frequently used predictors in statistics and machine learning use weighted linear combinations of features. Such linear predictors can model non-linear relationships between features by adding interaction terms corresponding to the products of all pairs of features. We consider the problem of accurately estimating coefficients for interaction terms in linear predictors. We hypothesize that the coefficients for different interaction terms have an approximate low-dimensional structure and represent each feature by a latent vector in a low-dimensional space. This low-dimensional representation can be viewed as a structured regularization approach that further mitigates overfitting in high-dimensional settings beyond standard regularizers such as the lasso and elastic net. We demonstrate that our approach, called LIT-LVM, achieves superior prediction accuracy compared to elastic net and factorization machines on a wide variety of simulated and real data, particularly when the number of interaction terms is high compared to the number of samples. LIT-LVM also provides low-dimensional latent representations for features that are useful for visualizing and analyzing their relationships.
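The parameterization described in the abstract, where each pairwise interaction coefficient is the inner product of the two features' latent vectors, can be sketched as follows. This is a minimal NumPy illustration of the idea, not the authors' implementation; the dimensions, variable names, and random data are assumptions for demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)

d, k = 5, 2                    # d features, latent dimension k << d (illustrative)
w = rng.normal(size=d)         # main-effect coefficients
V = rng.normal(size=(d, k))    # one k-dimensional latent vector per feature

def predict(x, w, V, b=0.0):
    """Linear predictor with pairwise interaction terms whose coefficients
    are inner products of latent vectors:
        f(x) = b + w . x + sum_{i<j} <V[i], V[j]> * x_i * x_j
    Instead of d*(d-1)/2 free interaction coefficients, only d*k latent
    parameters are learned, which acts as structured regularization."""
    interaction = 0.0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            interaction += (V[i] @ V[j]) * x[i] * x[j]
    return b + w @ x + interaction

x = rng.normal(size=d)
print(predict(x, w, V))
```

Equivalently, the implied interaction-coefficient matrix is `W = V @ V.T` (upper triangle), which is why the approach can be viewed as a low-rank structural prior on the interaction terms.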
Problem

Research questions and friction points this paper is trying to address.

Accurately estimating coefficients for interaction terms in linear predictors
Mitigating overfitting in high-dimensional settings via structured regularization
Providing low-dimensional latent representations of features for visualization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Latent variable models that parameterize interaction-term coefficients
Low-dimensional structured regularization beyond lasso and elastic net
Superior prediction accuracy in high-dimensional settings