🤖 AI Summary
To mitigate overfitting when estimating interaction coefficients in high-dimensional sparse settings, this paper proposes a structured regularization framework: each pairwise interaction coefficient is parameterized as the inner product of low-dimensional latent vectors associated with the corresponding features, imposing a low-rank structural prior, and the model is fit by gradient-based maximum likelihood estimation. This latent-variable parameterization of interaction terms preserves the interpretability of the linear model while improving generalization. On synthetic and real-world datasets in which the number of interaction terms far exceeds the sample size, the method achieves higher predictive accuracy than elastic net and factorization machines. Moreover, the learned latent vectors yield interpretable visualizations of feature relationships, supporting post-hoc analysis of interaction structure.
📝 Abstract
Some of the simplest, yet most frequently used predictors in statistics and machine learning use weighted linear combinations of features. Such linear predictors can model non-linear relationships between features by adding interaction terms corresponding to the products of all pairs of features. We consider the problem of accurately estimating coefficients for interaction terms in linear predictors. We hypothesize that the coefficients for different interaction terms have an approximate low-dimensional structure and represent each feature by a latent vector in a low-dimensional space. This low-dimensional representation can be viewed as a structured regularization approach that further mitigates overfitting in high-dimensional settings beyond standard regularizers such as the lasso and elastic net. We demonstrate that our approach, called LIT-LVM, achieves superior prediction accuracy compared to elastic net and factorization machines on a wide variety of simulated and real data, particularly when the number of interaction terms is high compared to the number of samples. LIT-LVM also provides low-dimensional latent representations for features that are useful for visualizing and analyzing their relationships.
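To make the parameterization concrete, here is a minimal NumPy sketch of the core idea described above: a linear predictor whose pairwise interaction coefficients are inner products of low-dimensional latent feature vectors, so the interaction coefficient matrix has rank at most d. The names `w`, `V`, and `predict`, and all dimensions, are illustrative assumptions, not the paper's actual implementation; fitting in LIT-LVM proceeds by gradient-based maximum likelihood with additional regularization, which is omitted here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: p features, d-dimensional latent space (d << p), n samples.
p, d, n = 20, 3, 200

# Hypothetical parameters: linear weights w, and one latent vector per feature
# stacked as rows of V. In practice these would be learned, not sampled.
w = rng.normal(size=p)
V = rng.normal(size=(p, d))

def predict(X, w, V):
    """Linear predictor with pairwise interactions whose coefficients are
    inner products of latent vectors: theta_ij = <v_i, v_j>."""
    linear = X @ w
    # Interaction coefficient matrix Theta = V V^T has rank <= d.
    Theta = V @ V.T
    # Sum over i < j of theta_ij * x_i * x_j, via the quadratic-form identity:
    # sum_{i<j} = 0.5 * (full quadratic form minus the diagonal terms).
    quad = np.einsum('ni,ij,nj->n', X, Theta, X)
    diag = np.sum((X ** 2) * np.diag(Theta), axis=1)
    return linear + 0.5 * (quad - diag)

X = rng.normal(size=(n, p))
y_hat = predict(X, w, V)
```

Replacing the latent inner products with a free p-by-p coefficient matrix recovers an ordinary interaction model with O(p^2) parameters; the low-rank structure reduces this to O(pd), which is the source of the regularization effect discussed in the abstract.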